ANONYMITY AND COVERT CHANNELS IN MIX-FIREWALLS

By

VIPAN REDDY R. NALLA

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2004

Copyright 2004 by Vipan Reddy R. Nalla

ACKNOWLEDGMENTS

I would like to gratefully acknowledge the great supervision of Dr. Richard Newman during this work. I thank Dr. Joseph Wilson and Dr. Shigang Chen for serving on my committee and for reviewing my work. I would like to thank Ira Moskowitz and the Naval Research Labs for funding me through research grants. I am grateful to all my friends who helped me directly or indirectly in preparing this work. Finally, I am forever indebted to my parents for helping me to reach this stage in my life.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF FIGURES
ABSTRACT

1 INTRODUCTION

2 MIXES AND MIX NETWORKS
  2.1 Mix
  2.2 Types of Mixes
    2.2.1 Simple Mixes
    2.2.2 Pool Mixes
  2.3 Mix Networks
    2.3.1 Design Issues in Mix Networks
    2.3.2 Classification of Mix Networks
  2.4 Realtime Mix Networks
    2.4.1 Crowds
    2.4.2 Onion Routing
    2.4.3 Babel
    2.4.4 Mixmaster
    2.4.5 Freedom
    2.4.6 PipeNet
    2.4.7 Stop-and-Go Mixes
    2.4.8 Tarzan
  2.5 Summary
3 ADVERSARY MODELS AND ATTACKS ON MIXES
  3.1 Adversary Models
    3.1.1 Internal and External Adversary
    3.1.2 Active and Passive Adversary
    3.1.3 Local, Restricted and Global Adversary
    3.1.4 Static and Adaptive Adversary
  3.2 Attacks on Mixes
    3.2.1 Active Attacks
    3.2.2 Passive Attacks
  3.3 Summary

4 ANONYMITY METRICS AND ANALYSIS TECHNIQUE
  4.1 Anonymity
  4.2 Anonymity Metrics
    4.2.1 Anonymity Sets
    4.2.2 Problems with Anonymity Set Size
    4.2.3 Entropy
    4.2.4 Route Length
    4.2.5 Covert Channels
    4.2.6 Covert Channels in Mix Networks
    4.2.7 Covert Channel Capacity as Anonymity Metric
  4.3 Analysis Technique
    4.3.1 Scenarios
    4.3.2 Channel Matrix
  4.4 Summary

5 PREVIOUS WORK AND THE EXIT-MIX MODEL
  5.1 Capacity Analysis for Indistinguishable Receivers Case
    5.1.1 Case 0: Alice Alone
    5.1.2 Case 1: Alice and One Additional Clueless Transmitter
    5.1.3 Case 2: Alice and N Additional Transmitters
  5.2 Exit-Mix Model
    5.2.1 Scenario
    5.2.2 Channel Matrix Probabilities
  5.3 Capacity Analysis for Exit-Mix Scenario
    5.3.1 One Receiver (M = 1)
    5.3.2 Some Special Cases for Two Receivers (M = 2)
    5.3.3 Some Special Cases for Three Receivers (M = 3)
    5.3.4 Some Generalized Cases of N and M
    5.3.5 Non-Uniform Message Distributions
  5.4 Summary

6 DISCUSSION OF RESULTS
  6.1 Capacity vs. Clueless Transmitters
  6.2 Capacity vs. Number of Receivers
  6.3 Capacity vs. Mutual Information at x0 = 1/(M + 1)
  6.4 Capacity vs. Message Distributions
  6.5 Comments and Generalizations
  6.6 Summary

7 CONCLUSIONS AND FUTURE WORK

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF FIGURES

4-1 Vulnerability of Anonymity Sets
4-2 Restricted Passive Adversary Model
4-3 Global Passive Adversary Model
5-1 Channel Model for Subsection 5.1.1. A) Channel block diagram. B) Channel transition diagram
5-2 Plot of Covert Channel Capacity as a Function of p
5-3 Channel for Case 3, the General Case of N Clueless Users. A) Channel transition diagram. B) Channel matrix
5-4 Exit Mix-firewall Model with N Clueless Senders and M Distinguishable Receivers
5-5 Case 4: System with N = 1 Clueless Sender and M = 2 Receivers
5-6 Capacity for N = 1 Clueless Sender and M = 2 Receivers
5-7 Case 5: System with N = 2 Clueless Senders and M = 2 Receivers
5-8 Capacity for N = 2 Clueless Senders and M = 2 Receivers
5-9 Case 6: System with N = 1 Clueless Sender and M = 3 Receivers
5-10 Capacity for N = 1 Clueless Sender and M = 3 Receivers
5-11 Capacity for N = 2 Clueless Senders and M = 3 Receivers
5-12 Case 7: System with N = 2 Clueless Senders and M = 3 Receivers
5-13 Case 8: System with N = 1 Clueless Sender and M Receivers
5-14 Case 9: System with N Clueless Senders and M = 2 Receivers
6-1 Capacity for N = 1 to 4 Clueless Senders and M = 2 Receivers
6-2 Capacity for N = 1, 2, 4 Clueless Senders and M = 3 Receivers
6-3 Mutual Information vs. x0 for N = 1 Clueless Sender and M = 2 Receivers, for p = 0.25, 0.33, 0.5, 0.67
6-4 Mutual Information vs. p for N = 2 Clueless Senders and M = 2 Receivers
6-5 Mutual Information vs.
p for N = 2 Clueless Senders and M = 3 Receivers
6-6 Value of x0 that Maximizes Mutual Information for N = 2, 3, 4 Clueless Senders and M = 3 Receivers as a Function of p
6-7 Normalized Mutual Information when x0 = 1/4 for N = 1, 2, 3, 4 Clueless Senders and M = 3 Receivers
6-8 Capacity for N = 1 Clueless Sender and M = 1 to 5 Receivers
6-9 Capacity for N = 0 to 9 Clueless Senders and M = 1 to 10
6-10 Capacity for Uniform, Zipf, and 80/20 Distributions for Clueless Transmitter and Uniform Distribution for Clueless Transmitter
6-11 Capacity for Uniform, Zipf, and 80/20 Distributions for Alice and Uniform Distribution for Clueless Transmitter
6-12 Capacity for Uniform, Zipf, and 80/20 Distributions for Alice and Zipf Distribution for Clueless Transmitter

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

ANONYMITY AND COVERT CHANNELS IN MIX-FIREWALLS

By

Vipan Reddy R. Nalla

December 2004

Chair: Richard E. Newman
Major Department: Computer and Information Science and Engineering

Privacy is becoming a critical issue on the Internet. Some people want to keep their purchases private. They do not want to have third parties (or even merchants) know their identity. This concern may arise because the customer is buying a good of questionable social value (e.g., pornography); or because the customer does not want to have his name added to a marketing or mailing list; or for illegal reasons (e.g., to evade taxes); or simply because the customer personally values privacy. Mix networks are the most promising approach to anonymizing communication on the Internet. Originally designed to anonymize email communication, variations of the basic design have led to systems that provide anonymity for low-latency applications such as web browsing.
Traditional methods for evaluating the amount of anonymity afforded by various mix configurations have depended on either measuring the size of the set of possible senders of a particular message (the anonymity set size), or on measuring the entropy associated with the probability distribution of the messages of possible senders. Our study further explores an alternative way of assessing the anonymity of a mix system by considering the capacity of a covert channel from a sender behind the mix to an observer of the mix's output.

CHAPTER 1
INTRODUCTION

Privacy is becoming a critical issue on the Internet. Some people want to keep their purchases private. They do not want to have third parties (or even merchants) know their identity. This concern may arise because the customer is buying a good of questionable social value (e.g., pornography); or because the customer does not want to have his name added to a marketing or mailing list; or for illegal reasons (e.g., to evade taxes); or simply because the customer personally values privacy. Elections constantly remind us that one of the most important barriers to electronic voting is users' fear of having their privacy violated. Unfortunately, this fear is justified, as marketers and national security agencies have been very aggressive in monitoring user activity.

Mix networks [3] are the most promising approach to anonymizing communication on the Internet. Originally designed to anonymize email communication, variations of the basic design have led to systems that provide anonymity for low-latency applications such as web browsing. None of these anonymity networks was designed with the covert channel threat in mind. The goal of this work is to show that even in what appears to be a benign form of communication, information may still leak out of the network.

Overview. Our study addressed anonymity and covert channels.
The major contribution of our study is the identification, analysis, and capacity estimation of the covert channels that arise from the use of a Mix [3, 21] as an exit firewall.

Mixes are special nodes in a network that relay messages while hiding the correspondence between their input and their output. A careful explanation and a detailed classification of mixes are presented in Chapter 2. Several mixes can be chained to relay a message anonymously. These systems provide the best compromise between security and efficiency in terms of bandwidth, latency, and overheads. Design issues related to mix networks are also presented, along with examples of some realtime mix-based anonymizing systems. Chapter 3 presents various adversary models, followed by a comprehensive listing of attacks against mixes and mix networks.

Anonymity is an important issue in electronic payments, electronic auctions, and electronic voting, and also for email and web browsing. A communication can never be truly anonymous, but relative anonymity can be achieved. Chapter 4 defines anonymity and presents various types of anonymity. It also describes generalized methods to measure anonymity and the technique used for analysis. We measured the lack of perfect anonymity via a covert channel. Covert channel analysis includes finding the security flaw, developing covert channel scenarios, and analyzing their capacity. Chapter 4 also gives a brief description of a particular flavor of covert channels arising in mix networks.

Chapter 5 presents the adversary model, with details of the terminology and model setup. It also presents initial work involving a simple model [13] with a restricted passive adversary (RPA), along with results and conclusions. It then presents the main analysis done in the thesis, which includes analyzing the capacity of the covert channels for different cases of senders and receivers. A detailed discussion of the results of this analysis forms Chapter 6. Chapter 7 presents conclusions and suggests future work needed in this area.

CHAPTER 2
MIXES AND MIX NETWORKS

2.1 Mix

David Chaum first introduced mix networks for untraceable electronic mail [3]. A mix server randomly permutes and decrypts input messages. The key property of the mix network is that an observer cannot tell which output ciphertext corresponds to a given input message. Chaum's original system used a very simple threshold mix model, but since then many different types of mixes have been proposed in the literature, and some of them are being used in practice.

A mix server is classified by the batching strategy used. The batching strategy involves collecting messages, mixing them well, and flushing the messages when certain conditions are met. The flushing algorithm used in the mix can be expressed as a function P : N -> [0, 1] from the number of messages inside the mix to the fraction of messages to be flushed. The flushing condition is expressed in terms of a time interval t, a threshold of n messages collected in the mix, or a combination of both.

2.2 Types of Mixes

Based on the flushing algorithm used, mixes can be divided into simple mixes and pool mixes.

2.2.1 Simple Mixes

A simple mix flushes all the messages it contains when the flushing conditions are met. Hence, the value of the function P(n) is equal to one. These mixes can be further classified depending on the flushing condition used.

Threshold mix.
* Flushing Condition Parameters: threshold on messages collected in the mix, n.
* Flushing Algorithm: the mix fires all the messages when n messages are collected.
* Message delay: The minimum delay is ε (this happens when the mix already contained n - 1 messages before the target message arrives). The maximum delay can be infinite, if no more messages arrive after the target message. Assuming a message arrival rate r, the average message delay is approximately n/(2r).
* Anonymity: Assuming all the messages in the mix are from different senders and go to different receivers, the probability that an outgoing message corresponds to a particular incoming message is 1/n. This probability always equals 1/n, since the threshold n is constant.

Timed mix.
* Flushing Condition Parameters: time interval, t.
* Flushing Algorithm: The mix flushes (all the messages in the mix) every t time units (generally seconds).
* Message delay: The minimum delay is ε, when the target message arrives just before the mix fires. The maximum delay is t - ε, when the target message arrives just after the mix has fired. Hence, the mean delay is t/2 time units.
* Anonymity: The anonymity of the mix depends on the number of messages arriving in a particular flushing interval. The minimum anonymity is zero, when no other message arrives in the time interval. The maximum anonymity is theoretically infinite, but is limited in practice by the number of messages the mix can hold. Assuming a message arrival rate of r, a total of rt messages are fired each round, so the probability that an outgoing message corresponds to a particular incoming message is 1/(rt).

Threshold-or-timed mix.
* Flushing Condition Parameters: time interval, t; threshold on messages, n.
* Flushing Algorithm: The mix flushes (all the messages in the mix) every t time units (generally seconds), or as soon as n messages accumulate in the mix, whichever comes first.
* Message delay: The minimum delay is ε, when the target message arrives just before the mix fires on time or when the mix already holds n - 1 messages. The maximum delay is t - ε, when the target message arrives just after the mix has fired and fewer than n messages arrive in the next interval.
* Anonymity: The anonymity of the mix depends on the number of messages arriving in a particular flushing interval. The minimum anonymity is zero, when no other message arrives in the time interval.
The maximum anonymity is not infinite as in the previous case, because the mix never accumulates more than n messages. Hence the minimum probability that an outgoing message corresponds to a particular incoming message is 1/n.

Threshold-and-timed mix.
* Flushing Condition Parameters: time interval, t; threshold on messages, n.
* Flushing Algorithm: The mix flushes (all the messages in the mix) every t time units (generally seconds), but only when at least n messages have accumulated in the mix.
* Message delay: The minimum delay is ε, when the target message arrives just before the mix fires. The maximum delay can be infinite, if the number of messages accumulated stays below n.
* Anonymity: The minimum anonymity for this mix is no longer zero, since the mix does not fire until it has n messages. The maximum anonymity is in theory infinite, but is limited in practice by the number of messages the mix can hold. The maximum probability that an outgoing message corresponds to a particular incoming message is 1/n.

2.2.2 Pool Mixes

In pool mixes, the mix retains some messages at each flush, and hence the value of the flushing function P(n) is less than one. Pool mixes can be further divided into constant and dynamic pool mixes, depending on whether the value of the function P is constant over successive flushes.

Constant pool mixes. The simple mixes described earlier can be modified to retain a constant pool of messages for the next round.

Threshold pool mix.
* Flushing Condition Parameters: number of messages retained (pool), f; threshold on messages, n.
* Flushing Algorithm: The mix fires n messages when it accumulates n + f messages. The pool of f messages to be retained is chosen uniformly at random from the n + f messages collected in the mix.
* Message delay: The minimum delay is ε and the maximum delay is theoretically infinite. Serjantov, Syverson and Dingledine [20] analyze the threshold pool mixes in detail.
They calculate the mean delay by taking into account the fact that a message can be retained in the mix for an arbitrarily long time. The probability of a message being retained in a particular round is f/(n + f), so the mean delay is 1 + f/n rounds. If messages arrive at a rate of r messages per time unit, the average delay is (1 + f/n)(n/r) time units.
* Anonymity: The anonymity of a message going through a pool mix depends on the entire history of events that happened in the mix. The minimum anonymity of the mix is at least equal to that of the simple threshold mix. Serjantov and Newman [20] carried out the analysis and calculated the maximum anonymity in terms of the number of possible sender sets:

A_max = (1 + f/n) log(n + f) - (f/n) log(f)

Timed pool mix.
* Flushing Condition Parameters: number of messages retained (pool), f; time interval, t.
* Flushing Algorithm: The mix fires every t time units. A pool of f messages chosen uniformly at random is retained in the mix. If the number of messages accumulated is less than or equal to f, then the mix does not fire.
* Message delay: The minimum delay is ε and the maximum delay is infinite (when no message arrives for a long time, the messages retained in the pool never leave the mix). As in the threshold pool mix, there is a nonzero probability that a message is retained for an arbitrarily long time.

Dynamic pool mixes. Dynamic pool mixes are represented by the function P, and this function can be modified to maximize the anonymity obtained. The Cottrell mix [5] and the Binomial mix [20] are examples of dynamic pool mixes.

Timed dynamic pool mix (Cottrell mix).
* Flushing Condition Parameters: number of messages retained (pool), f; time interval, t; fraction of messages to be sent, a; threshold, n.
* Flushing Algorithm: The mix fires every t time units, provided there are at least n + f messages in the mix; however, instead of firing n messages, it fires max(1, floor(m*a)) messages, where m + f is the number of messages in the mix (m >= n).
* Message delay: As in the timed pool mix, the minimum delay is ε. The maximum delay is at least as high as that of the timed constant pool mix. The average delay depends on the future rate of arrival of messages.
* Anonymity: The anonymity provided by this mix is higher than that of the constant pool mixes. This is because, as the number of messages collected goes up, the fraction a keeps the chance of a message remaining in the dynamic pool mix constant. For a constant timed pool mix this chance decreases as more messages are collected, and in the case of the threshold pool mix the mix has to flush more frequently, reducing the chance of a message remaining in the mix per unit time.

Binomial mix.
* Flushing Condition Parameters: time interval, t; threshold, n.
* Flushing Algorithm: We can interpret the value of the flushing function P(n) as a probability. For each message collected, the mix tosses a biased coin: a head indicates that the message will be sent, and a tail indicates that it will remain in the mix. On average, the number of messages sent is s = nP(n); s follows the well-known binomial distribution, with variance np(1 - p), where p is the value of P(n).
* Message delay: The minimum delay is ε, and the maximum delay depends on the random binomial function P(n).
* Anonymity: The anonymity provided by this mix is much higher than that of the previously discussed mix types, because an attacker cannot easily determine the number of messages in the mix, n, by observing the value of s.

2.3 Mix Networks

The chain of mixes from a client to a server is called an anonymous tunnel or a mix network. A single encrypted connection is used to transport the data of multiple anonymous tunnels between two mixes.

2.3.1 Design Issues in Mix Networks

A mix network is characterized by the type of anonymity provided, the packet sizes, dummy traffic, routing, and the node-flushing algorithm used at individual nodes. We will discuss each of these issues briefly.

Anonymity.
Probably the most important design issue is that of anonymity versus pseudonymity. Pseudonymity means that some node(s) knows the user's pseudonym (but cannot link the pseudonym with a real-world identity). Another option is to have the user be anonymous in the mix network but pseudonymous in its dealings with other users (half-pseudonymity). Anonymity provides better security, since if a pseudonym (nym) is linked with a user, all future uses of the nym can be linked to the user. But pseudonymity has many other advantages when compared to complete anonymity. Pseudonymity provides the best of both worlds: privacy protection and accountability (and openness). Since pseudonyms (nyms) have a persistent nature, long-term relationships and trust can be cultivated. Authentication (verifying that someone has the right to use the network) is also easier with pseudonymity, because with full anonymity Chaumian blinding [4] must be used instead.

Packet sizes. The messages (e.g., web requests/replies) are chopped into fixed-length packets and are delivered in a particular order (lexicographic, etc.). This eliminates traffic analysis at a mix based on packet length. But in many situations, using different message sizes yields substantial performance improvements. For example, TCP/IP connections require on average one small control packet for every two (large) data packets. It might be inefficient for small messages to be padded, or for large packets to be split up, in order to get a message of the correct size. So we have a tradeoff between security and performance: using more than one message size gives better performance but worse security.

Dummy traffic. Dummy packets are normally introduced to reduce traffic-pattern-based attacks and, to some extent, the other passive attacks discussed in Section 3.2.2. Dummy messages contain random bit strings and are indistinguishable from real packets.
Dummy messages can be introduced between two mixes, between the client and the first mix in a tunnel, or between the client and the last mix in the tunnel (end-to-end dummies). This results in constant, bidirectional packet streams between any two mix nodes, or between the users and their entry node. Dummy traffic is often used in an unstructured manner in mix networks and might not be as effective as it could be; some studies [15, 16, 18, 26, 27] have discussed and analyzed the use of dummy traffic for traffic analysis prevention.

If a mix node sends its messages to fewer than t nodes, dummy messages should be sent in such a way that t nodes receive messages. The larger t is, the harder it is to mount brute-force search attacks and intersection attacks. Each mix node should also send messages to at least t destinations outside the mix network (dummy messages should be used to fill the gaps); again, the larger t is, the harder it is to mount the brute-force search attack. Furthermore, this technique complicates attacks in which the adversary monitors the exit nodes. Dummy messages can also be used to randomize a user's communication patterns, by making the user send dummy traffic to the entry node. The challenge here is to obtain good security while minimizing the amount of dummy traffic used.

Finally, dummy messages can be used to reduce the amount of time messages stay at a given node. Waiting for s messages to enter a mix node before sending t messages (t > s, with dummies filling the difference) seems to have security properties similar to those of waiting to receive t messages before releasing them. This trick can be used to reduce the time messages wait at nodes [18].

Routing. Routing can be either static, in which a preassigned set of routes is used, or dynamic, where the user chooses the nodes in his route randomly. For large Internet-based systems especially, having the user choose the nodes in his route is a viable option, for the following reasons. The nodes and users must know every other node, which might be impractical.
Some servers are far from each other, and it does not make sense from a performance viewpoint to have, for example, a route consisting of nodes in Australia, Canada, South Africa and China. Nodes should also be "socially" independent: ideally, the nodes in a route should belong to different organizations and be located in different legal jurisdictions. The whole idea behind using more than one node is that none of them has enough information to determine the sender-recipient matching; hence, if all nodes in a route belong to the same organization, we might as well just use a single node. The motivation for having nodes in different legal jurisdictions is that more than one subpoena needs to be obtained to compromise nodes legally.

Normally, systems use static routes, which allow mix nodes to associate each message with a connection identifier; this helps reduce the number of public key operations executed. On the negative side, fixed routes make some attacks considerably easier to carry out. Creating good network topologies and route-finding algorithms with respect to security and efficiency is not a trivial task, and needs a lot of analysis on the designer's part.

Node-flushing algorithm. As seen in Section 2.2, there are many different approaches to flushing nodes. Again, there is a security/practicality tradeoff: the longer messages can stay in mix nodes, the better the security (in most settings), since more users fall in the same anonymity set.
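The delay behaviour of the flushing strategies from Section 2.2 is easy to check empirically. Below is a small, self-contained simulation sketch (the function names and parameters are ours, not from the thesis): it reproduces the batch flushing of a threshold mix and the random retention of a threshold pool mix, whose mean residence time should come out near the 1 + f/n rounds cited earlier.

```python
import random

def threshold_mix_delays(arrivals, n):
    """Simple threshold mix: flush everything as soon as n messages accumulate.
    `arrivals` is a sorted list of arrival times; returns the delay of each
    delivered message (flush time minus arrival time)."""
    pool, delays = [], []
    for t in arrivals:
        pool.append(t)
        if len(pool) == n:
            delays.extend(t - a for a in pool)  # all flushed at time t
            pool = []
    return delays

def threshold_pool_mix_rounds(num_messages, n, f, rng=random.Random(42)):
    """Threshold pool mix: once n + f messages are present, flush n chosen
    uniformly at random and retain f. Returns, for each delivered message,
    the number of rounds it spent inside the mix."""
    pool, rounds, next_id = [], [], 0
    while len(rounds) < num_messages:
        while len(pool) < n + f:               # wait for n fresh arrivals
            pool.append([next_id, 0])
            next_id += 1
        for entry in pool:                     # every message ages one round
            entry[1] += 1
        rng.shuffle(pool)                      # uniform random choice of the pool
        flushed, pool = pool[:n], pool[n:]
        rounds.extend(age for _, age in flushed)
    return rounds

delays = threshold_mix_delays([0, 1, 2, 3, 4, 5], n=3)        # two flushes
avg_rounds = sum(threshold_pool_mix_rounds(5000, n=10, f=5)) / 5000
# avg_rounds should sit near 1 + f/n = 1.5 rounds
```

With one arrival per time unit and n = 3, the measured mean delay is exactly (n - 1)/2 = 1 time unit, consistent with the roughly n/(2r) figure for large n.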
2.3.2 Classification of Mix Networks

We can classify mix networks, based on the number of servers, into static mix networks and dynamic mix networks. Static mix networks are made up of a relatively small number of highly available, powerful mixes with good network connectivity that serve a much larger number of users (e.g., 100 mixes, 100,000 users). These networks can be operated either commercially or by volunteers. Dynamic mix networks are peer-to-peer based networks in which every client is also a mix server.

Dynamic mix networks have several advantages over static mix networks. In theory, there is no limit to the number of users they can support, and since they are peer-to-peer systems, the barrier to joining is low. Entry points (connections between a client and its first mix) are no longer visible, which makes end-to-end traffic analysis attacks more difficult to mount, and subverting or monitoring a significant fraction of the network (for an attacker) becomes expensive. With these advantages come new difficulties. "Dynamic" means that nodes can join and leave at any time, so the anonymous tunnels are less stable and may need to be re-established frequently. Discovering a node is a problem, and some nodes (e.g., on dialup connections) offer poor service, which degrades the quality of service of a tunnel.

We can also classify mix networks into two types based on the cryptographic alternative used: decryption mix nets [3] and re-encryption mix nets. Decryption mix nets take ciphertexts as input and decrypt them layer by layer, recovering the plaintext at the end node. Re-encryption mix nets use the malleability property of the El Gamal cryptosystem: each mix re-randomizes (re-encrypts) the ciphertext without changing the underlying plaintext, which is recovered by a decryption step at the end.
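To make the re-encryption idea concrete, here is a toy sketch of El Gamal re-randomization over a small prime-order group (the parameter sizes and names are illustrative only, nowhere near cryptographically secure): multiplying a ciphertext by a fresh encryption of 1 changes its appearance without changing the plaintext, which is exactly what each mix in a re-encryption net does.

```python
import random

# Toy El Gamal in the order-q subgroup of quadratic residues mod p, p = 2q + 1.
# Illustrative only: real systems use ~2048-bit primes or elliptic curves.
p, q, g = 467, 233, 4        # 4 generates the order-233 subgroup mod 467

def keygen(rng):
    x = rng.randrange(1, q)                   # secret key
    return x, pow(g, x, p)                    # (sk, pk = g^x mod p)

def encrypt(pk, m, rng):
    r = rng.randrange(1, q)
    return pow(g, r, p), (m * pow(pk, r, p)) % p       # (g^r, m * pk^r)

def reencrypt(pk, ct, rng):
    """Multiply in a fresh encryption of 1: same plaintext, new ciphertext."""
    a, b = ct
    s = rng.randrange(1, q)
    return (a * pow(g, s, p)) % p, (b * pow(pk, s, p)) % p

def decrypt(sk, ct):
    a, b = ct
    return (b * pow(a, p - 1 - sk, p)) % p    # b * a^(-sk) via Fermat's little theorem

rng = random.Random(7)
sk, pk = keygen(rng)
m = 16                                        # message: an element of the subgroup
ct1 = encrypt(pk, m, rng)
ct2 = reencrypt(pk, ct1, rng)                 # what one mix in the cascade does
```

A cascade applies `reencrypt` at every hop; only the exit performs `decrypt` (in practice a distributed/threshold decryption), so no individual mix ever sees the plaintext or can link its input to its output.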
2.4 Realtime Mix Networks

On the practical side, several systems have been implemented to provide fast, secure and anonymous communication. These systems differ in terms of infrastructure costs, the type of protection provided, and the transparency provided to users.

2.4.1 Crowds

Crowds [19] was developed by Reiter and Rubin at the AT&T Laboratories. It aims to provide a privacy-preserving way of accessing the web, without web sites being able to recognize which individual's machine is browsing. Crowds consists of a number of network nodes that are run by the users of the system. Web requests are randomly chained through a number of them before being forwarded to the web server hosting the requested data. The server will see a connection coming from one of the Crowds users, but cannot tell which of them is the original sender. In addition, Crowds uses encryption, so that some protection is provided against attackers who intercept a user's network connection. However, this encryption does not protect against an attacker who cooperates with one of the nodes that the user has selected, since the encryption key is shared among all nodes participating in a connection. Crowds is also vulnerable to passive traffic analysis: since the encrypted messages are forwarded without modification, traffic analysis is trivial if the attacker can observe all network connections. An eavesdropper intercepting only the encrypted messages between the user and the first node in the chain, as well as the cleartext messages between the final node and the web server, can associate the encrypted data with the plaintext using the data length and the transmission time.

2.4.2 Onion Routing

Onion Routing [7, 17, 24, 25] is the most famous of all anonymizing networks. In this system, a user sends encrypted data to a network of so-called onion routers (Chaumian mixes).
A trusted proxy chooses a series of these network nodes and opens a connection by sending a multiply encrypted data structure called an "onion" to the first of them. Each router is a store-and-forward device that receives messages of fixed length from different sources, removes one layer of encryption, which reveals parameters such as session keys, and forwards the encrypted remainder of the onion to the next network node. An onion router could store messages for an indefinite amount of time while waiting for an adequate number of messages to arrive, but this is not practically feasible; instead, the onion routers wait for a fixed amount of time, which weakens the protection in the presence of low traffic.

Once the connection is set up, an application-specific proxy forwards HTTP data through the Onion Routing network to a responder proxy, which establishes a connection with the web server the user wishes to use. The user's proxy multiply encrypts outgoing packets with the session keys it sent out in the setup phase; each node decrypts and forwards the packets, and encrypts and forwards packets that contain the server's response. The network model consists of core onion routers, the end-proxy routers, and the links between them, through which the routers pass messages of fixed length. The routers form a complete graph among themselves, so that every message has equal probability of being forwarded to any of the routers. All the links try to maintain the same bandwidth, which is achieved by sending dummy packets to pad the low-bandwidth links.

2.4.3 Babel

Babel [8] was designed in the mid-nineties. Babel offers sender anonymity, through the "forward path," and receiver anonymity, through replies travelling over the "return path." The forward part is constructed by the sender of an anonymous message by wrapping the message in layers of encryption. The message can also include a return address to be used to route the replies.
The system supports bidirectional anonymity by allowing messages to use a forward path, to protect the anonymity of the sender, while for the second half of the journey they are routed by the return address so as to hide the identity of the receiver. While the security of the forward path is as good as in the original mix network proposals, the security of the return path is slightly weaker. The integrity of the message cannot be protected, thereby allowing tagging attacks, since no information in the reply address, which is effectively the only information available to intermediate nodes, can contain the hash of the message body. The reason for this is that the message is only known to the person replying using the return address. Babel also proposes a system of inter-mix detours. Messages to be mixed could be "repackaged" by intermediary mixes and sent along a random route through the network. It is worth observing that when this is done, even the sender of the messages, who knows all the symmetric encryption keys used to encode and decode the message, cannot recognize it in the network.

2.4.4 MixMaster

Mixmaster has been an evolving system since 1995 [5, 11]. It is the most widely deployed and used remailer system. It follows a message-based approach, namely it supports sending single messages, usually email, through a fully connected mix network. Mixmaster supports only sender anonymity. Messages are made bitwise unlinkable by hybrid RSA and EDE 3DES encryption, while the message size is kept constant by appending random noise at the end of the message. In version two, the integrity of the RSA-encrypted header is protected by a hash, making tagging attacks on the header impossible. In version three, the noise to be appended is generated using a secret shared between the remailer and the sender of the message, included in the header.
Since the noise is predictable to the sender, it is possible to include in the header a hash of the whole message, thereby protecting the integrity of both the header and the body of the message. This trick makes replies impossible to construct, since the body of the message would not be known to the creator of an anonymous address block when computing the hash. Beyond the security features, Mixmaster provides quite a few usability features. It allows large messages to be divided into smaller chunks and sent independently through the network. If all the parts end up at a common mix, then reconstruction happens transparently in the network, so large emails can be sent to users without requiring special software. Recognizing that building robust remailer networks could be difficult (and indeed the first versions of the Mixmaster server software were notoriously unreliable), it also allowed messages to be sent multiple times, using different paths. It is worth noting that no analysis of the impact of these features on anonymity has ever been performed.

2.4.5 Freedom

The Freedom [2] network consists of a set of nodes called Anonymous Internet Proxies (AIPs) which run on top of the existing Internet infrastructure. The user communicates by first selecting a series of nodes (a route), and then using this route to forward IP packets that are stripped of identifying information. This system is secure against denial-of-service attacks but is vulnerable to some general traffic analysis attacks such as the packet-counting attack, Wei Dai's attack, the latency attack, and the clogging attack.

2.4.6 PipeNet

PipeNet was one of the early systems to be implemented. It is a synchronous network implemented on top of an asynchronous network. Routes are created through the network by choosing the intermediate hops uniformly at random. To provide further anonymity, a certain number of route creation requests are collected by a node, shuffled, and then acted upon.
The user establishes a shared key with each node on its route as part of the route creation process, using a key negotiation algorithm. The routes are padded end to end for their duration. End-to-end padding means that the originator creates all of the padding and the recipient (or exit node) strips the padding; each of the intermediate nodes is unable to distinguish padding from normal traffic, and just processes it as normal. This system provides protection against general traffic analysis but is vulnerable to denial-of-service attacks, which are more catastrophic in nature than traffic analysis attacks.

2.4.7 StopAndGo Mixes

Stop-and-go mixes [9] (sg-mixes) present a mixing strategy that is based not on batches but on delays. It aims at minimizing the potential for (n - 1) attacks, in which the attacker inserts a genuine message into a mix along with a flood of his own messages until the mix processes the batch; it is then trivial to observe where the traced message is going. Each packet to be processed by an sg-mix contains a delay and a time window. The delay is chosen according to an exponential distribution by the original sender, and the time windows can be calculated given all the delays. Each sg-mix receiving a message checks that it has been received within the time window, delays the message for the specified amount of time, and then forwards it to the next mix or final recipient. If the message was received outside the specified time window, it is discarded. A very important feature of sg-mixes is the mathematical analysis of the anonymity they provide. Each mix can be modeled as an M/M/∞ queue, and the number of messages waiting inside it follows the Poisson distribution. The delays can therefore be adjusted to provide the necessary anonymity set size.

2.4.8 Tarzan

Freedman designed Tarzan [19], a peer-to-peer network in which every node is a mix.
A node initiating the transport of a stream through the network creates an encrypted tunnel to another node, and asks that node to connect the stream to yet another server. By repeating this process a few times it is possible to have an onion-encrypted connection relayed through a sequence of intermediate nodes. An interesting feature of Tarzan is that the network topology is somewhat restricted. Each node maintains persistent connections with a small set of other nodes, forming a structure called mimics. Routes of anonymous messages are then selected in such a way that they travel through and between mimics, in order to avoid links with insufficient traffic. A weakness of the mimics scheme is that the selection of neighboring nodes is done on the basis of a network identifier or address which, unfortunately, is easy to spoof in real-world networks.

2.5 Summary

In this chapter, we have presented in detail the different types of mixes, based on the blending strategies and flushing conditions used. Mixes are divided into simple and pool mixes depending on whether the mix flushes all of its messages or not. These two categories are further subdivided into timed and threshold mixes, based on whether the flushing condition is a time interval or a threshold on the number of messages. We can also have hybrid mix types, which have both timed and/or threshold properties. We have also described anonymous communication systems based on mix networks. Various issues involved in the design of mix networks were presented, including the most important issue of how much anonymity the network provides and which type of mix is used to assure such anonymity. Finally, we discussed different deployed real-time mix systems, such as Crowds, Onion Routing, and Mixmaster, and the functionality provided in those systems. Different adversary models and attacks on mix networks are presented in the next chapter.
The next chapter also discusses the anonymity metrics used in practice to measure the level of anonymity provided by an anonymizing system, and describes the analysis technique used to analyze passive attacks on mixes.

CHAPTER 3
ADVERSARY MODELS AND ATTACKS ON MIXES

In this chapter, we discuss the various adversary models, followed by different types of attacks. The attacks include active attacks, such as timing attacks and denial-of-service attacks, and passive attacks, which are mainly accomplished through traffic analysis.

3.1 Adversary Models

The adversary models discussed below are high-level descriptions of the attacker's powers and limitations [6].

3.1.1 Internal and External Adversary

An adversary can be a user compromising communication media and network resources (external). An adversary can also be a compromised mix node, sender, or recipient trying to leak information to outsiders (internal).

3.1.2 Active and Passive Adversary

An active adversary can arbitrarily modify messages and computations, cause interruption of service, fabricate new messages, and intercept messages. Denial of service and loss of data are examples of interruption; spoofing and forging are examples of fabrication and modification. A passive adversary can only listen to the traffic. This is typically done by eavesdropping on network connections through wiretapping, or by signal catching in the case of wireless transmissions. We can also have a combination of active and passive adversaries. For example, an active external adversary can insert secret messages while a passive internal adversary correlates the messages coming into a compromised node with the messages going out.

3.1.3 Local, Restricted and Global Adversary

A global adversary has the ability to see link traffic on every link and control each and every resource in the network, whereas a local adversary can observe traffic only on certain links in the network.
Depending on whether the adversary has complete control over a few local links or restricted control over a certain area of the network, he is called a local or a restricted adversary.

3.1.4 Static and Adaptive Adversary

A static adversary chooses the required tools before the attack starts and cannot change them in the middle of the attack. Most brute-force attacks (e.g., password crackers) come under this category, since the attacker exhausts all combinations of inputs using an automated tool, which normally is not adaptive. Adaptive adversaries use different tools and resources depending on the response they receive from the previous stage of the attack. They can, for example, "follow" messages that are mixed with the original message.

3.2 Attacks on Mixes

The attacks described below are high-level descriptions of the attacker's schemes and are not dependent on any specific implementation [18]. We assume that there are no known implementation weaknesses in the system. The attacker can have any combination of the adversary powers discussed in the previous section. In the security literature, attacks are broadly classified into two main categories: active and passive attacks.

3.2.1 Active Attacks

An active attack is one in which the intruder may transmit messages, replay old messages, modify messages in transit, or delete selected messages from the wire. A typical active attack is one in which an intruder impersonates one end of the conversation, or acts as a man-in-the-middle. Active attacks often have asymmetric characteristics, in that the attacker's location makes one of the communicating parties more vulnerable. Some of the common active attack schemes are discussed briefly below.

Brute force attack. This is the simplest and most inefficient of the attacks. A brute force attack is one that requires trying all (or a large fraction of all) possible values until the right value is found.
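The exhaustive search just defined can be sketched in a few lines. This is a toy illustration only; the function and the PIN example are ours, not part of any mix system.

```python
def brute_force(check, candidates):
    """Try every candidate in turn until the check succeeds (exhaustive search).

    Returns the matching value and the number of attempts made."""
    for attempts, guess in enumerate(candidates, start=1):
        if check(guess):
            return guess, attempts
    return None, attempts

# Toy example: recover a 3-digit PIN by trying all 1000 possibilities.
secret = "042"
pin, tries = brute_force(lambda g: g == secret, ("%03d" % i for i in range(1000)))
assert pin == "042" and tries == 43   # "042" is the 43rd candidate tried
```

The cost is linear in the size of the search space, which is why the attack becomes interesting against mixes only when the space of possible routes is small.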
In the case of mixes, the adversary may try to follow every possible path the message could have taken (as a passive external adversary). Using this attack, the attacker is able in most cases to construct a list of possible recipients for a particular message, and if the mix or mix network is not designed well, the attacker may even be able to establish the sender-receiver correspondence. To illustrate how the brute force attack works, consider a mix network whose individual nodes are threshold mixes with threshold n, and assume that each message goes through exactly d mix nodes. The attacker follows a message from the sender to the first mix node. The attacker then follows each of the n messages flushed from the first mix node; to do this, the attacker needs to observe n different links if all the second-level mixes are different. The attacker continues this way until the route length of d nodes is reached, at which point he is following n^d messages. From these n^d messages, the attacker now has to choose only those messages that leave the mix network. In the worst case, the attacker can learn the exact receiver from this attack. If the mix network is designed for perfect anonymity, the attacker may end up with n^d possibilities. Dummy messages are normally used as the countermeasure against the brute force attack.

Denial-of-service attack. A denial-of-service (DoS) attack is an incident in which a user or organization is deprived of the services of a resource they would normally expect to have. Network flooding, spamming, port jamming, the SYN attack (in the case of the TCP protocol), and disk or memory exhaustion are some well-known techniques for mounting a DoS attack. By rendering some mix nodes inoperative, the adversary tries to gain information about the routes chosen by the remaining nodes in the case of static networks, and by certain senders in the case of dynamic mix networks.

Message-delaying attack.
In this scheme, the attacker withholds messages until he can obtain enough resources (i.e., links, nodes) or until the network becomes easier to monitor (or to see whether the possible recipients receive other messages, etc.). To defend against this attack, the mix nodes should be equipped to verify authenticated timing information.

Message-tagging attack. This type of attack requires an active internal adversary with control over the first and last nodes on a message route. To launch the attack, the attacker simply tags messages at the first node in such a way that the exit node can spot them. Since the entry node knows the sender and the exit node knows the recipient, the system is broken. To prevent this attack, measures should be taken to minimize or eliminate the possibility of message tagging.

Node-flushing or blending attack. This attack was first mentioned by David Chaum [21] in his seminal paper. The flushing attack is very effective and can be mounted by an active global adversary. A spamming attack or n - 1 attack is a very good example of this type of attack. The capabilities of the adversary include delaying (removing) messages and inserting arbitrarily many messages into the system in a short time. The attack is illustrated for a simple threshold mix with threshold n. The attacker observes the target message leaving the sender and delays it. The attacker then sends fabricated messages until the mix fires. As soon as the mix fires, he stops all other messages to the mix and sends the target message along with n - 1 of his own messages. After the mix fires, the attacker can easily recognize his n - 1 messages and therefore determine the destination of the target message. This is an exact attack; that is, it provides the adversary with the exact receiver rather than a set of receivers, as in the case of the brute force attack. Also note that this attack is mix-specific and does not depend on the rest of the mix network.

Timing attack.
In this attack, the adversary uses the fact that different routes can take different amounts of time. Given the set of messages coming into the mix network and the set of outgoing messages, the adversary uses the route time information to establish a correlation between certain sets of incoming and outgoing messages. The attacker does not need to carry out the expensive brute force or flushing attacks to determine the route taken. If the attacker has access to one of the communicating parties, he might be able to infer which route is taken simply by computing the round-trip time (that is, calculating the time it takes to receive a reply). This attack can be prevented by using variable-delay mixes, which wait for a random amount of time before firing. This causes uncertainty in estimating route lengths when the times taken are close in magnitude.

Wei Dai's attack. In this attack, the attacker wishes to defeat the traffic shaping mechanisms [1] that attempt to hide the real volumes of traffic on an anonymous channel. The attacker creates a route using the link that he wishes to observe, and slowly increases the traffic on it. The router will not know that the stream or streams are all under the control of the attacker, and at some point will signal that the link has reached its maximum capacity. The attacker then subtracts the volume of traffic he was sending from the maximum capacity of the link to estimate the volume of honest traffic.

Disclosure attack. The formal model on which the disclosure attack is based is quite simple. A single mix is used by b participants each round, one of them always being Alice, while the other b - 1 are chosen randomly out of a total of N - 1 possible participants. The threshold of the mix is b, so it fires after each of the round's participants has contributed one message. Alice chooses the recipient of her message to be a random member of a fixed set of m recipients.
Each of the other participants sends a message to a recipient chosen uniformly at random out of N potential recipients. We assume that the other senders and Alice choose the recipients of their messages independently of each other. The attacker observes R1, ..., Rt, the recipient anonymity sets corresponding to t messages sent out by Alice during t different rounds of mixing. The attacker then tries to establish to which of the potential recipients each of Alice's messages was sent. The original attack as proposed by Kesdogan et al. [9] first tries to identify mutually disjoint sets of recipients from the sequence of recipient anonymity sets corresponding to Alice's messages. This operation is the main bottleneck for the attacker, since it takes time exponential in the number of messages to be analyzed.

3.2.2 Passive Attacks

A passive attack is one in which the intruder attempts to intercept and read data without altering it. Passive monitoring attacks are often symmetric: if the attacker can see the traffic from Alice to Bob on a particular link, there is a good chance that he can see the traffic in the reverse direction.

Communication-pattern attack. By simply looking at communication patterns (when users send and receive), one can find out much useful information. Communicating participants normally don't "talk" at the same time; that is, when one party is sending, the other is usually silent. The longer an attacker can observe this type of communication synchronization, the less likely it is just an uncorrelated random pattern. This attack can be mounted by a passive adversary that can monitor the entry and exit mix nodes. Law enforcement officials might be quite successful in mounting this kind of attack, as they often have a priori information: they usually have a hunch that two parties are communicating and just want to confirm their suspicion.

Packet-counting attack.
These types of attacks are similar to the other passive attacks in that they exploit the fact that some communications are easy to distinguish from others. If a participant sends a non-standard (i.e., unusual) number of messages, a passive external attacker can spot these messages coming out of the mix network. In fact, unless all users send the same number of messages, this type of attack allows the adversary to gain non-trivial information. The packet-counting and communication-pattern attacks can be combined to get a message-frequency attack (this might require more precise timing information). Communication-pattern, packet-counting, and message-frequency attacks are sometimes referred to as traffic shaping attacks, and are usually dealt with by imposing rigid structures on user communications. Notice that protocols achieving "network unobservability" are immune to these attacks.

Intersection attack. An attacker who knows which users are active at any given time can, through repeated observations, determine which users communicate with each other. This attack is based on the observation that users typically communicate with a relatively small number of parties; for example, the typical user usually queries the same web sites in different sessions (his queries aren't random). By performing an operation similar to an intersection on the sets of active users at different times, the attacker can probably gain interesting information.

Probabilistic or partial attack. Most of the preceding attacks can be carried out partially; that is, the attacker can obtain partial or probabilistic information. For example, he could deduce with probability p that A is communicating with B, or that A is not communicating with B, C, and D.

Covert channels. Covert channels are discussed in Section 4.2.5.

3.3 Summary

In this chapter, we presented various attacks on a mix node or a mix network and the adversary models used to mount these attacks.
The adversary can be an insider or an external observer, an active attacker or a passive eavesdropper, a local attacker or a global adversary who has control over the whole network. The attacks are divided into active and passive attacks. Active attacks involve modification, fabrication, and interception of messages by the attacker; some well-known examples are the brute force attack, the denial-of-service (DoS) attack, and the node-flushing attack. Passive attacks allow an attacker to compromise anonymity by observing the network traffic for traffic patterns, packet counts, packet sizes, etc. Passive attacks are very difficult to detect and may prove to be very harmful. Chapter 4 presents the various anonymity metrics and the analysis technique used to analyze various attacks under distinct adversary models.

CHAPTER 4
ANONYMITY METRICS AND ANALYSIS TECHNIQUE

This chapter describes information-theoretic models, proposed in the literature, to quantify the degree of anonymity provided by different systems of mix networks. We first discuss the use of anonymity sets as the measure of anonymity, and then go on to analyze the entropy-based and route-based metrics. Finally, we present anonymity analysis of real-time anonymizing systems such as Onion Routing and Crowds.

4.1 Anonymity

Anonymity is required in many applications, such as electronic voting. Anonymity can be classified as connection anonymity and data anonymity. Data anonymity is about hiding the contents of the packets sent and received in a particular session; it is normally achieved by encryption. Connection anonymity is about hiding the identities of the source and the destination during the actual information exchange. As discussed by Reiter and Rubin [19], there are three types of connection anonymity: sender anonymity, receiver anonymity, and unlinkability of sender and receiver. Sender anonymity means that the identity of the party who sent a message is hidden, while its receiver (and the message itself) might not be.
Receiver anonymity similarly means that the identity of the receiver is hidden. Unlinkability of sender and receiver means that though the sender and receiver can each be identified as participating in some communication, they cannot be identified as communicating with each other. A second aspect of anonymous communication is the adversary model against which these properties are achieved. The attacker might be an eavesdropper that can observe some or all messages sent and received, a collaboration consisting of some senders, receivers, and other parties, or variations of these. Different types of attacks and adversary models were discussed in Chapter 3. We cannot provide "perfect" privacy, since the number of possible senders and recipients is bounded. For example, if there are only two parties on the network, an attacker having access to this information can trivially determine who is communicating with whom. The best we can hope for is to make all possible sender-recipient matchings look equally likely; that is, the statistical distribution of the attacker's view should be independent of the actual sender-recipient matching.

4.2 Anonymity Metrics

Many real-time anonymity systems have been deployed in the past decade, Onion Routing and Crowds being two examples. With each of these systems providing a different level of anonymity, there is a definite need for standard metrics to classify the levels of anonymity provided. Information theory has proven to be a useful tool to measure the amount of information; this can be used to measure the information gained by the attacker. Depending on the power of the attacker and the circumstances, we can quantify the anonymity level provided by the system.

4.2.1 Anonymity Sets

Traditionally, anonymity sets have been used to measure the anonymity of mix systems. The notion of anonymity sets was introduced by Chaum for modeling the security of DC-nets (Dining Cryptographers' networks) [3].
Chaum defines the anonymity set as the set of participants who could have sent a particular message, as seen by a global observer who has also compromised a set of nodes [4]. The size of the anonymity set is a good indicator of how good the anonymity provided by the system really is. In the best case, the anonymity set is equal to the set of all users, which means any user has equal probability of having sent the message. In the worst case, its size is one, which means there is no anonymity in the network.

4.2.2 Problems with Anonymity Set Size

The attacks against DC networks presented in [4] can only result in partitions of the network in which all the participants are still equally likely to have sent or received a particular message. Therefore the size of the anonymity set is a good metric of the quality of the anonymity offered to the remaining participants. In the stop-and-go system [9] definition, the authors realize that different senders may not be equally likely to have sent a particular message, but choose to ignore this. If the different participants counted in the anonymity set are not equally likely to be the senders or receivers, a designer might be tempted to distribute amongst many participants some small possibility that they were the senders or receivers, while allowing the real sender or receiver to have an abnormally high probability. The cardinality of the anonymity set is in this case a misleading measure of anonymity. In the standardization attempt, there is an effort to state and take this fact into account in the notion of anonymity, yet a formal definition is still lacking. Serjantov and Danezis [20] discuss this fact in their paper and conclude that it is widely ignored in the literature but can give a lot of extra information to the attacker.

The Pool Mix. We discuss the case of the pool mix to further emphasize the dangers of using sets and their cardinalities to assess and compare anonymity systems.
This mix always stores a pool of n messages. When N incoming messages have accumulated in its buffer, it picks n at random out of the n + N messages it holds and stores them, forwarding the remaining N in the regular manner. The details of the pool mix were described in Section 2.2. There is always a small probability that a message that went into the mix long ago has never left it. Therefore, the sender of every message that has ever entered the mix should be included in the anonymity set. If we consider the anonymity provided by this system in terms of anonymity set size, the set would include the senders of all messages that have ever gone into the mix. We notice that this anonymity set is independent of the size of the pool, n, which intuitively suggests that the metric used is inappropriate.

Knowledge Vulnerability. The anonymity set metric is also vulnerable when the attacker has additional knowledge about the system. Consider the arrangement of mixes in Figure 4-1. The small squares in the diagram represent senders, labeled with their names. The bigger boxes are mixes, each with a threshold of 2. Some of the receivers are labeled with their sender anonymity sets. Notice that if the attacker somehow establishes the fact that, for instance, A is communicating with R, he can derive the fact that S received a message from E.

Figure 4-1: Vulnerability of Anonymity Sets

Indeed, to expose the link E -> S, all the attacker needs to know is that one of A, B, C, D is communicating with R. And yet this is in no way reflected in S's sender anonymity set (although E's receiver anonymity set, as expected, contains just R and S). It is also clear that not all senders in this arrangement are equally vulnerable to this, and that other arrangements of mixes may be less vulnerable. Although we have highlighted the attack here by using mixes with a threshold of 2, it is clear that the principle can be used in general to cut down the size of the anonymity set.
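Returning to the pool mix described above: since each flush keeps any particular message with probability n / (n + N), a given message is still inside the pool after r flushes with probability (n / (n + N))^r, which is tiny but never zero. A minimal sketch of this calculation (function and parameter names are ours):

```python
def retention_probability(pool_n, batch_N, rounds):
    """Probability that a specific message is still inside a pool mix
    after a number of flushes: each flush keeps any particular message
    with probability n / (n + N)."""
    return (pool_n / (pool_n + batch_N)) ** rounds

# Even after 20 flushes of a small pool the probability is non-zero,
# so every past sender formally stays in the anonymity set.
p = retention_probability(10, 40, 20)
assert 0 < p < 1e-10
```

The number is astronomically small, yet the anonymity-set formalism counts those senders all the same, which is exactly the mismatch the section describes.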
4.2.3 Entropy

Serjantov and Danezis [20] formalized the use of entropy as an anonymity metric and extended it to calculate the anonymity in a system of mixes. The principal insight behind the entropy metric is that the goal of an attacker is the unique identification of an actor (sender or receiver), while the goal of the defender is to increase the attacker's workload in achieving this. Therefore we choose to define the anonymity provided by a system as the amount of information the attacker is missing to uniquely identify an actor's link to an action. The term information is used in the technical sense of Shannon's information theory [22]. We define a probability distribution over all actors a_i, describing the probability that each performed a particular action. As one would expect, the sum of these probabilities must always be equal to one:

SUM_i Pr[a_i] = 1

As soon as the probability distribution above is known, one can calculate the anonymity provided by the system as a measure of the uncertainty that the probability distribution represents. In information-theoretic terms, this is the entropy of the discrete probability distribution. Therefore we call the entropy of the probability distribution attributing a role to actors, given a threat model, the effective anonymity set size of a system. It can be calculated as

A = - SUM_i Pr[a_i] log2 Pr[a_i]

This metric represents the number of bits of information an adversary is missing before he can uniquely identify the target. A similar metric based on information theory was proposed by Diaz et al. [6]. Instead of directly using the entropy as a measure of anonymity, it is normalized by the maximum amount of anonymity that the system could provide. This has the disadvantage that it is more a measure of fulfilled potential than of anonymity.
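Both the entropy metric and the normalized variant just described can be sketched in a few lines (a hedged illustration; the function names are ours, and base-2 logarithms give the answer in bits):

```python
import math

def entropy_anonymity(probs):
    """Effective anonymity set size: the entropy of the attacker's
    probability distribution over actors, i.e. the bits of information
    the attacker is missing."""
    assert abs(sum(probs) - 1.0) < 1e-9   # must be a probability distribution
    return -sum(p * math.log2(p) for p in probs if p > 0)

def normalized_degree(probs):
    """Normalized degree in the style of Diaz et al.: entropy divided
    by its maximum possible value, log2(N)."""
    n = len(probs)
    return entropy_anonymity(probs) / math.log2(n) if n > 1 else 0.0

uniform = [1 / 8] * 8
skewed = [0.93] + [0.01] * 7
assert abs(entropy_anonymity(uniform) - 3.0) < 1e-9    # log2(8) = 3 bits
assert entropy_anonymity(skewed) < 1.0                 # far less hidden
assert abs(normalized_degree(uniform) - 1.0) < 1e-9    # maximal for this system
```

Note how the skewed distribution has the same anonymity set cardinality (eight senders) but less than a third of a uniform distribution's entropy, which is precisely the distinction the cardinality metric misses.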
A normalized anonymity degree of 1 then means that one is as anonymous as the system allows, even though one might not be very anonymous at all. The non-normalized entropy-based metric intuitively provides an indication of the size of the group within which one is hidden, and is also a good indication of the effort necessary for an adversary to uniquely identify a sender or receiver.

4.2.4 Route Length

In the previous section, we demonstrated that entropy-based metrics can give the attacker more information about the system than anonymity sets alone. We note that the standard attacks aimed at reducing the size of the anonymity set will now have the effect of narrowing the anonymity probability distribution. If we consider this distribution as a set of pairs (of a sender and its respective non-zero probability of having sent the message), then narrowing the probability distribution is the process of deriving that some senders have zero probability of having sent the message and can therefore be safely excluded from the set. As argued in [20], route length is important, and some arrangements of mixes are more vulnerable to route-length-based attacks than others. If the attacker knows the maximum route length allowed by the mix system, then he can eliminate all routes longer than the maximum length. This reduces the entropy of the anonymity probability distributions without affecting the underlying anonymity set. Hence, the maximum route length should be taken into account when calculating anonymity sets. Several mix systems have been designed to remove the maximum route length constraint, for instance via tunneling in Onion Routing [17] or hybrid mixes, but it exists in fielded systems such as Mixmaster [5, 11] (maximum route length of 20) and so can be used by the attacker. It may also be possible to obtain relevant information by compromising a mix.
Some mix systems will allow a mix to infer the number of mixes a message has already passed through, and therefore the maximum number of mixes it may still pass through before reaching its destination. Such information would strengthen our attack, so care needs to be taken to design mix systems (such as Mixmaster [5]) that do not give it away. The following subsections present examples of covert channels, covert channel analysis (CCA), and covert channels arising in mix networks.

4.2.5 Covert Channels

Covert channels can be either innocuous or harmful. Innocuous channels are consistent with the intent of the system's security policy. They may result in surprising system behaviors, but do not place the system or the information that it protects at risk. Harmful covert channels are information flows that are contrary to the intent of the system's security policy. Several definitions for covert channels have been proposed in the literature, such as the following:

Definition 1: A communication channel is covert if it is neither designed nor intended to transfer information at all.

Definition 2: A covert channel is a mechanism that can be used to transfer information from one user of a system to another using means not intended for this purpose by the system developers.

Definition 3: Covert channels will be defined as those channels that are a result of resource allocation policies and resource management implementation.

All the above definitions are vague (what is information? what is intent?) and omit any discussion of security. None of them brings out explicitly the notion that covert channels depend on the type of mandatory access control policy being used (e.g., the Bell-LaPadula or Biba model) and on the policy's implementation within a system design.
A new definition using these concepts can be provided that is consistent with the TCSEC definition of covert channels: "A covert channel is a communication channel that allows a process to transfer information in a manner that violates the system's security policy." In any scenario of covert channel exploitation, one must define the synchronization relationship between the sender and the receiver of information; thus, a covert channel is characterized by this synchronization relationship. The purpose of synchronization is for one process to notify the other that it has completed reading or writing a data variable. Therefore, a covert channel may include not only a covert data variable but also two synchronization variables, one for sender-to-receiver synchronization and the other for receiver-to-sender synchronization. Any form of synchronous communication requires both sender-to-receiver and receiver-to-sender synchronization, either implicit or explicit. However, sender-to-receiver synchronization may still need a synchronization variable to inform the receiver of a bit transfer. A channel that does not include sender-to-receiver synchronization variables, in a system allowing the receiver-to-sender transfer of messages, is called a quasi-synchronous channel. In all patterns of sender-receiver synchronization, synchronization data may be included in the data variable itself at the expense of some bandwidth degradation. Packet-formatting bits in ring and Ethernet local area networks are examples of synchronization data sent along with the information being transmitted. Thus, explicit sender-to-receiver synchronization through a separate variable may be unnecessary. Covert channels are a more serious problem in a network system: network traffic analysis is much easier than monitoring CPU timing and process scheduling. A network covert channel can be based on either timing or spatial information in the traffic flow pattern.
Using spatial information, an eavesdropper observing network traffic can observe the size and destination of packets to obtain information. In collaboration with an internal active adversary, the covert channel can be coded by varying the packet size and destination. Using timing information, a covert channel is represented by the frequency and burstiness of packet generation. The next subsection discusses a particular type of covert channel existing in mix networks.

4.2.6 Covert Channels in Mix Networks

An insider can use the exit-mix server to covertly communicate with an external passive eavesdropper, using the fact that the eavesdropper (Eve) can probabilistically determine whether the insider (Alice) sends a message in a particular time interval. This is an example of a one-directional network covert channel, and was first discovered by Newman, Moskowitz, Crepeau, and Miller [13]. To illustrate the channel, let us assume that we have a simple exit-mix server. Alice, the insider, wants to transfer information covertly to the eavesdropper, Eve. The only action that Eve can take is to count the number of messages per tick t going from the Mix-firewall to each of the receivers, since the messages are indistinguishable. In a perfect noiseless scenario with a single receiver, Alice can transmit bits 1 and 0 to Eve by sending or not sending a message. Alice can use a predecided encoding to send important information through this channel. The external adversary model can be either a global model, which observes all the links originating from the mix as shown in Figure 4-3, or a restricted model, which can count the number of messages between two enclaves as shown in Figure 4-2.

4.2.7 Covert Channel Capacity as an Anonymity Metric

In the covert channel scenario presented in the previous subsection, Alice can obviously leak considerable information to Eve. The ability to communicate covertly arises due to a lack of anonymity.
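The noiseless single-receiver scenario of Section 4.2.6 can be sketched directly; this toy illustration (ours, not the thesis's code) shows Alice encoding bits by sending or staying silent each tick, while Eve merely counts messages leaving the Mix-firewall:

```python
def alice_encode(bits):
    """Alice signals bit 1 by sending a message in a tick and bit 0 by
    staying silent, per a predecided encoding."""
    return [1 if b else 0 for b in bits]  # messages Alice emits per tick

def eve_decode(counts):
    """Eve, the passive eavesdropper, only sees per-tick message counts;
    with no clueless senders the count equals Alice's bit."""
    return [1 if c > 0 else 0 for c in counts]

secret = [1, 0, 1, 1, 0, 0, 1]
observed = alice_encode(secret)  # no other senders: counts == Alice's actions
assert eve_decode(observed) == secret  # perfect recovery: 1 bit per tick
```

With clueless senders present, the counts become noisy and the decoding problem becomes the capacity analysis of the following sections.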
If there were "perfect" anonymity, then we would not expect to find a covert channel [13]. By measuring the amount of covert information that may be leaked through less-than-perfect anonymity, we can obtain an estimate of the anonymity provided by the system. The mutual information is a good indication of the interference between sender and eavesdropper. One way to measure this is by estimating the lower bound of capacity. Shannon's information theory [22] is used to calculate the mutual information and the capacity of the channel (which is the maximum value of the mutual information). The analysis technique and capacity calculations are presented in Section 4.3. In the initial work [13], it is shown that as system-level anonymity increases in the simple mix models (i.e., as the number of potential senders increases), the minimum capacity decreases to zero. However, as the probability that a clueless sender transmits in a given tick increases, the expected number of actual senders in a given tick also increases, and hence the anonymity increases; yet the capacity of the covert channel increases once this probability exceeds 0.5.

4.3 Analysis Technique

In this section we present some scenarios for covert channels arising when using a mix server, for different adversary models and network settings. The next subsection discusses the network channel matrix and capacity estimation.

4.3.1 Scenarios

There is always one special transmitting node in the network, called Alice, which is malicious. Alice has the capabilities of an active internal adversary and can either be static or dynamically adapt to retain the covert channel. Alice and possibly other transmitters (assume N of them) have legitimate business transmitting messages to a set of receivers R_i, i = 1, 2, ..., M. These transmitters act completely independently of one another, and have no direct knowledge of each other's recent transmission behavior.
Alice may have some general knowledge of the long-term traffic levels produced by the other transmitters, e.g., the number of other transmitters and their probabilistic behavior, which can allow Alice to write a code that improves the covert communication channel's data rate. She cannot, however, perform short-term adaptation to their behavior. We also assume that there is a clock, and that transmissions only occur in the unit interval of time called a tick. Any subset of transmitters can each either send a single message to a single receiver in a tick, or not send a message at all. Each transmitter in a tick can send to a different receiver, and two or more transmitters may send to the same receiver in the same tick. All messages' contents are encrypted end-to-end.

Figure 4-2: Restricted Passive Adversary Model

There is also an eavesdropper on the network called Eve. Since all transmissions are encrypted, they appear to the eavesdropper Eve as having indistinguishable content. Eve may be either a global passive adversary (GPA), with the ability to see link traffic on every link in the network, or a restricted passive adversary (RPA), with the ability to observe traffic only on certain links. Alice is not allowed any direct communication with Eve. However, Alice can influence what Eve sees on the network. We study network scenarios that attempt to achieve a degree of anonymity with respect to the network communication. That is, the networks are designed with various anonymity devices to prevent Eve from learning who is sending a message to whom. Even if a certain degree of anonymity is achieved, it still may be possible for Alice to communicate covertly with Eve.

4.3.2 Channel Matrix

Between Alice and the N clueless senders, there are N + 1 possible senders per tick t, and there are M + 1 possible actions per sender (since each sender may or may not transmit, and if it does transmit, it transmits to exactly one of the M receivers).
Figure 4-3: Global Passive Adversary Model

We consider Alice to be the input to the quasi-anonymous channel, which is a proper communications channel [22]. Alice can send to one of the M receivers or not send a message. Thus, we represent the inputs to the quasi-anonymous channel by the M + 1 input symbols 0, 1, ..., M, where i = 0 represents Alice not sending a message, and i ∈ {1, ..., M} represents Alice sending a message to the ith receiver R_i. Note, however, that the "receiver" in the quasi-anonymous channel is Eve. Eve receives the output symbols e_j, j = 1, ..., K; she receives e_1 if no sender sends a message. The quasi-anonymous channel that we have been describing is a discrete memoryless channel (DMC). We define the channel matrix M as an (M + 1) x K matrix, where M[i, j] represents the conditional probability that Eve observes the output symbol e_j given that Alice input i:

          e_1      e_2      ...   e_K
  0    [ P_{0,1}  P_{0,2}   ...  P_{0,K} ]
  1    [ P_{1,1}  P_{1,2}   ...  P_{1,K} ]
  :        :        :        :      :
  M    [ P_{M,1}  P_{M,2}   ...  P_{M,K} ]

where P_{i,j} = P(E = e_j | A = i). The number of symbols seen by Eve may vary, depending on the adversary model considered. For example, with an RPA observing a link between two mix-enclaves, the number of symbols observed by Eve is N + 2, whereas if a GPA is observing all the links going out of an exit-mix, the number of possible symbols is much higher and is a function of the number of receivers, M. The output symbols correspond to the different ways the N + 1 senders can send or not send, at most one message each, out of the private enclave, provided at least one sender does send a message. For example, there is only one output symbol observed by Eve for each of the N + 1 ways that one, and only one, sender can send a message to R_i.
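Given a channel matrix M[i][j] = P(E = e_j | A = i) as just defined, the mutual information I(A, E) = H(E) - H(E|A) and the capacity C = max I(A, E) can be estimated numerically. The sketch below is an illustration of ours: the thesis maximizes numerically without naming a method, so we use the standard Blahut-Arimoto iteration as one reasonable choice:

```python
import math

def mutual_information(px, M):
    """I(A;E) = H(E) - H(E|A) for input distribution px over the rows of
    channel matrix M, where M[i][j] = P(E = e_j | A = i)."""
    K = len(M[0])
    pe = [sum(px[i] * M[i][j] for i in range(len(px))) for j in range(K)]
    h_e = -sum(p * math.log2(p) for p in pe if p > 0)
    h_e_given_a = -sum(px[i] * M[i][j] * math.log2(M[i][j])
                       for i in range(len(px)) for j in range(K) if M[i][j] > 0)
    return h_e - h_e_given_a

def capacity(M, iters=500):
    """Blahut-Arimoto iteration: repeatedly reweight the input distribution
    toward the capacity-achieving one, then report the mutual information."""
    n, K = len(M), len(M[0])
    px = [1.0 / n] * n
    for _ in range(iters):
        pe = [sum(px[i] * M[i][j] for i in range(n)) for j in range(K)]
        r = [px[i] * math.exp(sum(M[i][j] * math.log(M[i][j] / pe[j])
                                  for j in range(K) if M[i][j] > 0))
             for i in range(n)]
        z = sum(r)
        px = [ri / z for ri in r]
    return mutual_information(px, M)
```

For a noiseless 2x2 identity matrix this yields a capacity of 1 bit per tick, and for a completely uniform matrix it yields 0, matching intuition.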
We model Alice according to the following distribution each tick t: P(Alice sends a message to R_i) = x_i. From this, we get

x_0 = P(Alice doesn't send a message) = 1 - Σ_{i=1}^{M} x_i.

We let A represent the distribution of Alice's input behavior, and we denote by E the distribution of the output symbols that Eve receives. Thus, the channel matrix M along with the distribution A totally determines the quasi-anonymous channel. This is because the elements of M take the distributions C_i into account, and M and A let one determine the distribution E describing the outputs that Eve receives, P(Eve receives e_j). Given a discrete random variable X taking on the values x_i, i = 1, ..., n_X, the entropy of X is

H(X) = -Σ_{i=1}^{n_X} p(x_i) log p(x_i).

We use p(x_i) as a shorthand notation for P(X = x_i). Given two such discrete random variables X and Y, we define the conditional entropy (equivocation) to be

H(X|Y) = -Σ_{i=1}^{n_Y} p(y_i) Σ_{j=1}^{n_X} p(x_j | y_i) log p(x_j | y_i).

Given two such random variables, we define the mutual information between them to be

I(X, Y) = H(X) - H(X|Y).

Note that H(X) - H(X|Y) = H(Y) - H(Y|X), so we see that I(X, Y) = I(Y, X). For a DMC whose transmitter random variable is X and whose receiver random variable is Y, we define the channel capacity [22] to be

C = max_X I(X, Y),

where the maximization is over all possible distribution values p(x_i) (that is, the p(x_i) are all nonnegative and sum to one). For us, the capacity of the covert channel between Alice and Eve is

C = max {H(E) - H(E|A)},

where the maximization is over the different possible values that the x_i may take (of course, the x_i are still constrained to represent a probability distribution). Recall that M[i, j] = P(E = e_j | A = i), where M[i, j] is the entry in the ith row and jth column of the channel matrix M.

4.4 Summary

In this chapter we have defined the objectives of anonymous communication and the threats against it. We have shown how using the anonymity set as a metric can lead to wrong results.
The pool mix was used as an example to illustrate how the anonymity set indicated perfect anonymity when that was intuitively not possible. We presented entropy as a metric measuring anonymity, based on Shannon's information theory. It represents how much information an adversary is missing to identify the sender or the receiver of a target message. Using covert channel capacity as a measure of anonymity was discussed, followed by covert channel scenarios in mix networks. Finally, we presented the channel matrix as the tool to estimate the channel capacity.

CHAPTER 5
PREVIOUS WORK AND THE EXIT-MIX MODEL

This chapter presents the previous work (which forms the basis of our work) and the exit-mix firewall model setup and assumptions. It describes the conventions and terminology used, the message distribution probabilities, the traffic adversary model, and the channel matrix in detail.

5.1 Capacity Analysis for the Indistinguishable Receivers Case

The initial work [13] analyzed the situation where there are two enclaves, communication between them is encrypted, and packets are sent only from the first enclave (which contains Alice) to the second (Fig. 4-2). Eve is able to monitor the communication from the first enclave to the second. Anonymity is "maintained" in that an eavesdropper such as Eve (as RPA) does not know who is sending a message (that is hidden inside the first enclave) nor who is receiving the message (this can only be known if one is interior to the second enclave). Eve is only allowed to know how many messages per tick travel from the first enclave to the second. Nonetheless, Alice attempts to communicate covertly with Eve. The input symbols for this channel are 0, which signifies that Alice is not transmitting a message to any receiver, and 0^c, which signifies that Alice is transmitting a message to some receiver (keep in mind that Alice is oblivious to the other transmitters). We break the scenario down into three cases: case 5.1.1, case 5.1.2, and case 5.1.3.
Case 5.1.3 is the general form of the scenario, and the first two are simplified special cases.

5.1.1 Case 0: Alice Alone

This is the case where N = 0: Alice is the only transmitter. Alice sends either 0 (by not sending a message) or 0^c (by sending a message). Eve receives either e_0 = 0 (Alice did nothing) or e_1 = 1 (Alice sent a message to a receiver). The capacity of this noiseless covert channel is 1. Note that the capacity is the maximum, over the probability x of Alice inputting a 0, of the mutual information I(E, A), where A is the distribution for Alice described by x, and E is the distribution for Eve. Since there is no noise, I is simply the entropy H(E) describing Eve (which is maximized to 1 when x = .5):

I(E, A) = H(E) = -x log x - (1 - x) log(1 - x).

5.1.2 Case 1: Alice and One Additional Clueless Transmitter

In this case N = 1. Therefore, Eve receives: 0 if neither Alice nor Clueless transmits; 1 if Alice does not transmit and Clueless does, or Alice transmits and Clueless does not; or 2 if both Alice and Clueless transmit.

Figure 5-1: Channel Model for Subsection 5.1.2. A) Channel block diagram. B) Channel transition diagram

Figure 5-1B shows the output symbols corresponding to the three states E might perceive. Let us consider the channel matrix. The 2 x 3 channel matrix entry M[i, j] represents the conditional probability of Eve receiving the symbol j when Alice sends the symbol i. It follows that p = α, and thus it trivially follows that q = β. So our channel matrix simplifies to:

          0   1   2
  0    [  p   q   0 ]
  0^c  [  0   p   q ]

The probability that Alice sends a 0 is P(A = 0) = x, and therefore P(A = 0^c) = 1 - x. The term x is the only term that can be varied to achieve capacity. Here is where Alice may use knowledge of the long-term transmission characteristics of the other transmitters, as well as how many other transmitters there are, to change her (long-term) behavior.
As with other studies of covert channels [12], we are not concerned with source coding/decoding issues [22]. Our concern is the limits on how well a transmitter can optimize its bit rate to a receiver, given that a channel is noisy. The capacity of the covert channel between Alice and Eve is

C = max_x {H(E) - H(E|A)}.

Given the above channel matrix we have:

H(E) = -{px log px + [qx + p(1 - x)] log[qx + p(1 - x)] + q(1 - x) log[q(1 - x)]}

and

H(E|A) = -Σ_{i=0}^{1} p(a_i) Σ_{j=0}^{2} p(e_j | a_i) log p(e_j | a_i) = h(p),

where h(p) denotes the function -p log p - (1 - p) log(1 - p). Thus,

C = max_x {-(px log px + [qx + p(1 - x)] log[qx + p(1 - x)] + q(1 - x) log[q(1 - x)]) - h(p)}.

We cannot analytically find the x that maximizes the mutual information, even using the standard trick of setting the derivative of the mutual information to zero. However, we can plot the capacity as a function of p, and the x value that maximizes the mutual information as a function of p.

Figure 5-2: Plot of Covert Channel Capacity as a Function of p (capacity, and the capacity-achieving x, versus p = P(Alice not sending a message))

Figure 5-2 shows certain symmetries. The capacity graph is symmetric about p = .5, and the graph of the x that achieves capacity is skew-symmetric about p = .5. Consider the two situations where p = ε and where p = 1 - ε, in both situations with 0 < ε < .5. Let x_ε be the probability for the input symbol 0 that achieves capacity in the first situation, and let x_{1-ε} be the probability that achieves capacity in the second. For the first situation, 1 - x_ε is the capacity-achieving probability for the input symbol 0^c, and similarly for the second situation 1 - x_{1-ε} is the capacity-achieving probability for the input symbol 0^c. Physically the two situations are "the same" if we reverse the roles of the output symbols 0 and 2. Therefore x_ε = 1 - x_{1-ε}. Writing x_ε as x_ε = 1/2 + Δ, we see that x_{1-ε} = 1/2 - Δ; this is what the lower dotted plot shows in Figure 5-2 (ε = 1/2 ⇒ Δ = 0).
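The curve C(p) and its symmetry can be reproduced numerically; a minimal sketch (`capacity_case1` is our illustrative helper, not code from the thesis) that brute-forces the maximization over x for the 2x3 matrix [[p, q, 0], [0, p, q]]:

```python
import math

def h(p):
    """Binary entropy h(p) = -p log p - (1-p) log(1-p), in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def capacity_case1(p, steps=2000):
    """Grid search over x = P(Alice sends 0) for Case 1; as derived above,
    H(E|A) = h(p) regardless of x, so only H(E) varies."""
    q = 1.0 - p
    best = 0.0
    for k in range(steps + 1):
        x = k / steps
        pe = [p * x, q * x + p * (1 - x), q * (1 - x)]  # Eve's output distribution
        he = -sum(pr * math.log2(pr) for pr in pe if pr > 0)
        best = max(best, he - h(p))
    return best
```

This reproduces the facts stated above: C(ε) = C(1 - ε), the minimum C(.5) = .5 bits per tick, and C = 1 in the noiseless extremes.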
Observation 1: In conditions of very little extra traffic, or very high extra traffic, the covert channel from Alice to Eve has higher capacity.

Observation 2: The capacity C(p), as a function of p, is strictly bounded below by C(.5), and C(.5) is achieved when the mutual information is evaluated at x = .5.

It is obvious that very little extra traffic corresponds to very little noise. At first glance, though, it seems counterintuitive that heavy traffic also corresponds to a small amount of noise. This is because the high traffic is used as a baseline against which to signal. This is analogous to transmission of bits over a channel where the bit error rate (BER) P_e is greater than 1/2: in this case, the capacity of the channel is the same as that of a channel with BER of 1 - P_e, obtained by first inverting all the bits. It is the in-between situations that negatively affect the signaling ability of Alice. But even in the noisiest case (i.e., where p = .5), Alice can still transmit with a capacity of half a bit per tick. Note that we can never guarantee error-free transmission, no matter how we group the output symbols. In fact, it is possible that the output will always be the symbol 1 (of course, the probability of this quickly approaches zero as the number of transmissions goes up), so this covert channel has a zero-error capacity [23] of zero. Capacity is a useful measure of a communication channel if the assumption is that the transmitter can transmit a large number of times. With a large number of transmissions, an error-correcting code can be utilized so as to achieve a rate close to capacity. If the transmitter makes only a small number of transmissions, then using the capacity alone can be misleading.

5.1.3 Case 2: Alice and N Additional Transmitters

We imagine that there are N + 1 transmitters; Alice is one of them, and the other N are all independently identical clueless transmitters. That is, there are transmitters Clueless_1, Clueless_2, ..., Clueless_N.
Again, Eve can only see how many messages are leaving the first Mix-firewall headed for the second Mix-firewall. Therefore Eve can determine whether there are 0, 1, ..., N + 1 messages leaving the firewall; that is all Eve can determine. Therefore, there are still the two input symbols a_0 = 0 and a_1 = 0^c, but we have N + 2 output symbols. The probability that Clueless_i does not send a message is still p, and the probability that it does send a message is q = 1 - p. Now we calculate the channel matrix, keeping in mind that Alice acts independently of the Clueless_i.

Alice sends a 0: For Eve to receive e_k (that is, E = k), 0 ≤ k ≤ N, we need k of the clueless transmitters to send a message and N - k not to send a message. Therefore,

p(e_k | A = 0) = (N choose k) p^(N-k) q^k, 0 ≤ k ≤ N, and p(e_{N+1} | A = 0) = 0.

Alice sends a 0^c: p(e_0 | A = 0^c) = 0, since that event never happens. For Eve to receive e_k (that is, E = k), 1 ≤ k ≤ N + 1, we need k - 1 of the clueless transmitters to send a message and N - k + 1 not to send a message:

p(e_k | A = 0^c) = (N choose k-1) p^(N-k+1) q^(k-1), 1 ≤ k ≤ N + 1.

The channel matrix is thus

  row 0:    [ p^N,  N p^(N-1) q,  (N choose 2) p^(N-2) q^2,  ...,  q^N,  0   ]
  row 0^c:  [ 0,    p^N,          N p^(N-1) q,               ...,  N p q^(N-1),  q^N ]

Figure 5-3: Channel for Case 2, the general case of N clueless users. A) Channel transition diagram. B) Channel matrix

We obtain the following results from the analysis; the full details and proofs are in [13]. In conditions of very little extra traffic, or very high extra traffic, the covert channel from Alice to Eve has higher capacity. The capacity C(p), as a function of p, is strictly bounded below by C(.5), and C(.5) is achieved when the mutual information is evaluated at x = .5 (of course, p = .5 also in this situation). The capacity C(p), as a function of p, is strictly bounded below by a function that decreases monotonically to zero as the number of transmitters increases, but is never zero.
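The two binomial formulas above can be checked with a short sketch (an illustration of ours) that builds the 2 x (N+2) channel matrix for Alice plus N i.i.d. clueless senders and a single receiver:

```python
from math import comb

def channel_matrix_caseN(N, p):
    """Row 0: Alice silent, Eve sees k clueless messages with probability
    C(N,k) p^(N-k) q^k. Row 1: Alice sends, shifting the count up by one."""
    q = 1.0 - p
    row0 = [comb(N, k) * p**(N - k) * q**k for k in range(N + 1)] + [0.0]
    row1 = [0.0] + [comb(N, k - 1) * p**(N - k + 1) * q**(k - 1)
                    for k in range(1, N + 2)]
    return [row0, row1]
```

For N = 1 and p = .5 this recovers the Case 1 matrix [[.5, .5, 0], [0, .5, .5]], and each row sums to one for any N, as the binomial theorem requires.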
The bias in the code used by Alice to achieve the optimum data rate on the channel is not always x = 0.5, but it is never far from 0.5, and our preliminary experimental results indicate that the difference in capacity is minor. This last observation agrees with [10], which presents the general result that in DMCs, the mutual information bit rate obtained by using x = .5 is no less than 94.21% of the channel capacity. Even if Alice has no knowledge of the probabilistic behavior of the other transmitters, her data rate will not be too far from optimal if she uses an unbiased code.

5.2 Exit-Mix Model

5.2.1 Scenario

There are N + 1 senders in a private enclave. Messages pass one way from the private enclave to a set of M receivers. The private enclave is behind a firewall that also functions as a timed Mix [21] firing every tick t; hence we call it a simple timed Mix-firewall. For the sake of simplicity we will refer to a simple timed Mix-firewall as a Mix-firewall in this paper. One of the N + 1 senders, called Alice, is malicious. The other N clueless senders, Clueless_i, i = 1, ..., N, are benign. Each sender may send at most one message per unit time t to the set of receivers. All messages from the private enclave to the set of receivers pass through public lines that are subject to eavesdropping by an eavesdropper called Eve. The only action that Eve can take is to count the number of messages per t going from the Mix-firewall to each receiver, since the messages are otherwise indistinguishable. Eve knows that there are N + 1 possible senders. The N clueless senders act in an independent and identically distributed (i.i.d.) manner according to a fixed distribution C_i, i = 1, ..., N. Alice, by sending or not sending a message each t to at most one receiver, affects Eve's message counts. This is how Alice covertly communicates with Eve via a quasi-anonymous channel [14].
Figure 5-4: Exit Mix-firewall Model with N Clueless Senders and M Distinguishable Receivers

Alice acts independently (through ignorance of the clueless senders) when deciding to send a message; we call this the ignorance assumption. Alice has the same distribution each t. Between Alice and the N clueless senders, there are N + 1 possible senders per t, and there are M + 1 possible actions per sender (each sender may or may not transmit, and if it does transmit, it transmits to exactly one of the M receivers). We consider Alice to be the input to the quasi-anonymous channel, which is a proper communications channel [22]. Alice can send to one of the M receivers or not send a message. Thus, we represent the inputs to the quasi-anonymous channel by the M + 1 input symbols 0, 1, ..., M, where i = 0 represents Alice not sending a message, and i ∈ {1, ..., M} represents Alice sending a message to the ith receiver R_i. The "receiver" in the quasi-anonymous channel is Eve. Eve receives the output symbols e_j, j = 1, ..., K; she receives e_1 if no sender sends a message. The other output symbols correspond to all the different ways the N + 1 senders can send or not send, at most one message each, out of the private enclave, provided at least one sender does send a message.

5.2.2 Channel Matrix Probabilities

For the sake of simplicity we introduce a dummy receiver R_0 (not shown above). If a sender does not send a message, we consider that to be a "message" to R_0. For N + 1 senders and M receivers, the output symbol e_j observed by Eve is an (M + 1)-vector (a_0, a_1, ..., a_M), where a_i is the number of messages the Mix-firewall sends to R_i. Of course it follows that 0 ≤ a_i ≤ N + 1. The quasi-anonymous channel that we have been describing is a discrete memoryless channel (DMC).
We define the channel matrix M as an (M + 1) x K matrix, where M[i, j] represents the conditional probability that Eve observes the output symbol e_j given that Alice input i. We model the clueless senders according to the i.i.d. C_i for each period of possible action t:

P(Clueless_i doesn't send a message) = p,
P(Clueless_i sends a message to any given receiver) = q/M,

where, in keeping with previous papers, q = 1 - p is the probability that Clueless_i sends a message to any one of the M receivers. When Clueless_i does send a message, the destination is uniformly distributed over the receivers R_1, ..., R_M. We call this the semi-uniformity assumption. Again, keep in mind that each clueless sender has the same distribution each t, but they all act independently of each other.

5.3 Capacity Analysis for the Exit-Mix Scenario

This section presents the capacity analysis for different cases of transmitters and receivers. Each case is discussed in detail, and the estimated capacities are compared among the cases. The mathematics involved in capacity estimation for this scenario is very complicated; hence, we estimate the capacity for simple cases and then try to generalize our observations to N senders and M receivers. To distinguish the various channel matrices, we adopt the notation that M_{N.M} is the channel matrix for N clueless senders and M receivers.

5.3.1 One Receiver (M = 1)

Case 1: No Clueless Senders and One Receiver (N = 0, M = 1). Alice is the only sender, and there is only one receiver R_1. Alice sends either 0 (by not sending a message) or 1 (by sending a message). Eve receives either e_1 = (1, 0) (Alice did nothing) or e_2 = (0, 1) (Alice sent a message to the receiver). Since there is no noise (there are no clueless senders), the channel matrix M_{0.1} is the 2x2 identity matrix, and it trivially follows that P(E = e_1) = x_0 and P(E = e_2) = x_1:

            e_1  e_2
  0      [   1    0  ]
  1      [   0    1  ]

Since x_0 = 1 - x_1, we see that¹ H(E) = -x_0 log x_0 - (1 - x_0) log(1 - x_0).
The channel matrix is an identity matrix, so the conditional probability distribution P(E|A) is made up of zeroes and ones; therefore H(E|A) is identically zero. Hence, the capacity is the maximum over x_0 of H(E), which is easily seen to be unity² (and occurs when x_0 = 1/2). Of course, we could have obtained this capacity³ without appealing to mutual information, since we can noiselessly send one bit per tick, but we wish to study the nontrivial cases and use this as a starting point.

Case 2: N Clueless Senders and One Receiver (M = 1). This case reduces to the indistinguishable receivers case with N senders analyzed in [13], with both an exit Mix-firewall as we have been discussing and an entry Mix-firewall (with the receivers behind the latter). Alice can either send or not send a message, so the input alphabet again has two symbols. Eve observes N + 2 possible output symbols. That is, Eve sees e_1 = (N + 1, 0), e_2 = (N, 1), e_3 = (N - 1, 2), ..., e_{N+2} = (0, N + 1). A detailed discussion of this case can be found in [13].

5.3.2 Some Special Cases for Two Receivers (M = 2)

There are two possible receivers. Alice can signal Eve with an alphabet of three symbols: 1 or 2, if Alice transmits to R_1 or R_2, respectively, or the symbol 0 for not sending a message. Let us analyze the channel matrices and the entropies for different numbers of senders.

¹ All logarithms are base 2.
² The units of capacity are bits per tick t, but we will take the units as being understood for the rest of the report. Recall that all symbols take one t to pass through the channel.
³ This uses Shannon's [22] asymptotic definition of capacity, which is equivalent for noiseless channels (in units of bits per symbol).

The symbol e_j that Eve receives is a 3-tuple of the form (a_0^j, a_1^j, a_2^j), where a_i^j is the number of messages received by the ith receiver.⁴ As before, the index i = 0 relates to Alice not sending any message. The elements of the 3-tuple must sum to the total number of senders, N + 1:

Σ_{i=0}^{2} a_i^j = N + 1.
Case 3: No Clueless Senders and Two Receivers (N = 0, M = 2). Alice is the only sender and can send messages to two possible receivers. The channel matrix is trivial and there is no anonymity in the channel:

            (1,0,0) (0,1,0) (0,0,1)
  0      [    1       0       0   ]
  1      [    0       1       0   ]
  2      [    0       0       1   ]

The subscript 0.2 represents one sender (Alice alone) and two receivers. The 3 x 3 channel matrix entry M_{0.2}[i, j] represents the conditional probability of Eve receiving the symbol e_j when Alice sends to the receiver R_i (A = i); '0' stands for not sending a message. The mutual information I is given by the entropy H(E) describing Eve:

I(E, A) = H(E) = -x_1 log x_1 - x_2 log x_2 - (1 - x_1 - x_2) log(1 - x_1 - x_2).

The capacity of this noiseless covert channel is log 3 ≈ 1.58 (at x_i = 1/3, i = 0, 1, 2). For M = 2 this is the largest capacity, which we note corresponds to zero anonymity. Of course, this is not surprising since there are no clueless senders.

Case 4: N = 1 Clueless Sender and M = 2 Receivers. The following row vector describes the probabilities of the possible output symbols when only the one clueless sender is involved:

  (1,0,0) (0,1,0) (0,0,1)
  (  p      q/2     q/2  )

⁴ Recall that the a_i's of the output symbol are not directly related to A, which denotes the distribution of Alice.

Figure 5-5: Case 4, with N = 1 Clueless Sender and M = 2 Receivers

The message-set matrix given below shows how the various output symbols can be formed. The rows correspond to Alice's actions, and the columns correspond to the actions of Clueless_1. Row and column labels are added elementwise to form the matrix entry, which is the output symbol corresponding to the channel state:

             (1,0,0)   (0,1,0)   (0,0,1)
  (1,0,0)   (2,0,0)   (1,1,0)   (1,0,1)
  (0,1,0)   (1,1,0)   (0,2,0)   (0,1,1)
  (0,0,1)   (1,0,1)   (0,1,1)   (0,0,2)

The set of distinct symbols formed in the matrix cells constitutes the set of output symbols Eve may receive. In this case, there are three repetitions in the message-set matrix, so Eve may receive 9 - 3 = 6 symbols.
Let us consider the channel matrix.

           (2,0,0)  (1,1,0)  (1,0,1)  (0,2,0)  (0,1,1)  (0,0,2)
        0     p       q/2      q/2       0        0        0
M_1.2 = 1     0        p        0       q/2      q/2       0
        2     0        0        p        0       q/2      q/2

The 3 x 6 channel matrix M_1.2[i,j] represents the conditional probability of Eve receiving the symbol ej when Alice sends to Ri. As noted, the dummy receiver R0 corresponds to Alice not sending to any receiver (however, this is still a transmission to Eve via the quasi-anonymous channel).

Figure 5-6: Capacity for N = 1 Clueless Sender and M = 2 Receivers

Given the above channel matrix we have:

H(E) = -{ p x0 log[p x0]
        + [q x0/2 + p x1] log[q x0/2 + p x1]
        + [q x0/2 + p x2] log[q x0/2 + p x2]
        + [q x1/2] log[q x1/2]
        + [q x1/2 + q x2/2] log[q x1/2 + q x2/2]
        + [q x2/2] log[q x2/2] }.

The conditional entropy is given by

H(E|A) = - Σ_{i=0}^{2} p(x_i) Σ_{j=1}^{6} p(e_j | x_i) log p(e_j | x_i) = h2(p),

where h2(p) denotes the function

h2(p) = -(1-p)/2 log((1-p)/2) - (1-p)/2 log((1-p)/2) - p log p = -(1-p) log((1-p)/2) - p log p.

Figure 5-7: Case 5: System with N = 2 Clueless Senders and M = 2 Receivers

The mutual information between Alice and Eve is given by

I(A,E) = H(E) - H(E|A),

and the channel capacity is given by

C = max I(A,E)
  = max_{x1,x2} -{ p x0 log[p x0] + [q x0/2 + p x1] log[q x0/2 + p x1]
    + [q x0/2 + p x2] log[q x0/2 + p x2] + [q x1/2] log[q x1/2]
    + [q x1/2 + q x2/2] log[q x1/2 + q x2/2] + [q x2/2] log[q x2/2] } - h2(p).

Note that the maximization is over x1 and x2, since x0 is determined by these two probabilities (this holds for any N). This equation is very difficult to solve analytically and requires numerical techniques. Figure 5-6 shows the capacity for this case with the curve for N = 1. From the plot, the minimum capacity is approximately 0.92, when p = 1/3. This is less than 1.58, which is the corresponding value for the N = 0 case. We will come back to this curve later for comparison purposes with other values of N.

Case 5: N = 2 Clueless Senders and M = 2 Receivers.
The row vector describing the output symbols and their probabilities with only the two clueless senders is given by

(2,0,0)  (1,1,0)  (1,0,1)  (0,2,0)  (0,1,1)  (0,0,2)
(  p^2      pq       pq     q^2/4    q^2/2    q^2/4 )

The symbol (2,0,0) has probability p^2 because both clueless senders do not send a message. The symbol (1,1,0) has probability 2p(q/2) = pq because either Clueless1 does not send a message and Clueless2 sends a message to R1, or vice versa. The other values behave similarly. The message-set matrix, which has the contributions from the clueless senders as the column index and the contributions from Alice as the row index, is as follows.

          (2,0,0)  (1,1,0)  (1,0,1)  (0,2,0)  (0,1,1)  (0,0,2)
(1,0,0)   (3,0,0)  (2,1,0)  (2,0,1)  (1,2,0)  (1,1,1)  (1,0,2)
(0,1,0)   (2,1,0)  (1,2,0)  (1,1,1)  (0,3,0)  (0,2,1)  (0,1,2)
(0,0,1)   (2,0,1)  (1,1,1)  (1,0,2)  (0,2,1)  (0,1,2)  (0,0,3)

By inspection of the matrix, we notice that the output symbols with more repetitions will have a higher probability of being seen by Eve, when compared to the others. That is, the output symbol (1,1,1) will have a greater probability of being observed than (3,0,0) or (0,3,0). The probability of observing a symbol also depends on the probability distribution of the transmitter over the receivers (i.e., the value of q). There are eight repetitions in the message-set matrix, so the total number of possible symbols Eve may receive is 18 - 8 = 10 symbols. The channel matrix M_2.2 is given below.

           (3,0,0)  (2,1,0)  (2,0,1)  (1,2,0)  (1,1,1)  (1,0,2)  (0,3,0)  (0,2,1)  (0,1,2)  (0,0,3)
        0    p^2       pq       pq     q^2/4    q^2/2    q^2/4      0        0        0        0
M_2.2 = 1     0       p^2       0       pq       pq        0      q^2/4    q^2/2    q^2/4      0
        2     0        0       p^2      0        pq        pq       0      q^2/4    q^2/2    q^2/4

The 3 x 10 channel matrix M_2.2[i,j] represents the conditional probability of Eve receiving ej when Alice sends a message to receiver Ri. Figure 5-8 shows the capacity for this case of N = 2. Again, the minimum capacity is found at p = 1/3 = 1/(M+1); from the plot, it is approximately 0.62.
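The repetition counts claimed above — 18 cells, 10 distinct symbols, with (1,1,1) the most repeated — can be verified by enumeration. A small sketch (ours, not from the thesis):

```python
from itertools import combinations_with_replacement

# Case 5: the 3 x 6 message-set matrix pairs Alice's three actions with the
# six distinct clueless symbols (multisets of two choices from {R0, R1, R2}).
M, N = 2, 2
cells = []
for alice in range(M + 1):
    for choices in combinations_with_replacement(range(M + 1), N):
        e = [0] * (M + 1)
        e[alice] += 1
        for c in choices:
            e[c] += 1
        cells.append(tuple(e))

print(len(cells), len(set(cells)))   # 18 cells, 10 distinct symbols
print(cells.count((1, 1, 1)))        # 3 -- the most-repeated symbol
```

The symbol (1,1,1) occurs in three different cells, whereas a skewed symbol such as (3,0,0) occurs in only one, matching the observation about which outputs Eve sees most often.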
5.3.3 Some Special Cases for Three Receivers (M = 3)

Case 6: N = 1 Clueless Sender and M = 3 Receivers. Alice or Clueless can send to three possible receivers or refrain from sending (denoted by '0').

Figure 5-8: Capacity for N = 2 Clueless Senders and M = 2 Receivers

Figure 5-9: Case 6: System with N = 1 Clueless Sender and M = 3 Receivers

The probabilities of the various output symbols from the one clueless sender are given below.

(1,0,0,0)  (0,1,0,0)  (0,0,1,0)  (0,0,0,1)
(    p        q/3        q/3        q/3   )

Now let us examine the number of possible message-set symbols obtained if we merge the individual message sets of Alice and Clueless.

Figure 5-10: Capacity for N = 1 Clueless Sender and M = 3 Receivers

            (1,0,0,0)  (0,1,0,0)  (0,0,1,0)  (0,0,0,1)
(1,0,0,0)   (2,0,0,0)  (1,1,0,0)  (1,0,1,0)  (1,0,0,1)
(0,1,0,0)   (1,1,0,0)  (0,2,0,0)  (0,1,1,0)  (0,1,0,1)
(0,0,1,0)   (1,0,1,0)  (0,1,1,0)  (0,0,2,0)  (0,0,1,1)
(0,0,0,1)   (1,0,0,1)  (0,1,0,1)  (0,0,1,1)  (0,0,0,2)

As we can see from the above message-set matrix, there are six repetitions in the message sets formed, so Eve may receive 16 - 6 = 10 different symbols. The channel matrix M_1.3 is given below.

           (2,0,0,0) (1,1,0,0) (1,0,1,0) (1,0,0,1) (0,2,0,0) (0,1,1,0) (0,1,0,1) (0,0,2,0) (0,0,1,1) (0,0,0,2)
        0      p        q/3       q/3       q/3        0         0         0         0         0         0
M_1.3 = 1      0         p         0         0        q/3       q/3       q/3        0         0         0
        2      0         0         p         0         0        q/3        0        q/3       q/3        0
        3      0         0         0         p         0         0        q/3        0        q/3       q/3

The 4 x 10 channel matrix M_1.3[i,j] represents the conditional probability of Eve receiving ej when Alice sends a message to receiver Ri. Figure 5-10 shows the capacity for this case of N = 1. The minimum capacity is found at p = 1/4 = 1/(M+1); from the plot, it is approximately 1.25.

Case 7: N = 2 Clueless Senders and M = 3 Receivers. The row vector describing how the clueless senders influence the output symbols is given below.
(2,0,0,0) (1,1,0,0) (1,0,1,0) (1,0,0,1) (0,2,0,0) (0,1,1,0) (0,1,0,1) (0,0,2,0) (0,0,1,1) (0,0,0,2)
(   p^2     2pq/3     2pq/3     2pq/3     q^2/9    2q^2/9    2q^2/9     q^2/9    2q^2/9     q^2/9  )

Now let us examine the size of the set of output symbols obtained if we merge the individual message sets of Alice and the two clueless senders:

            (2,0,0,0)  (1,1,0,0)  (1,0,1,0)  (1,0,0,1)  (0,2,0,0)  (0,1,1,0)  (0,1,0,1)  (0,0,2,0)  (0,0,1,1)  (0,0,0,2)
(1,0,0,0)   (3,0,0,0)  (2,1,0,0)  (2,0,1,0)  (2,0,0,1)  (1,2,0,0)  (1,1,1,0)  (1,1,0,1)  (1,0,2,0)  (1,0,1,1)  (1,0,0,2)
(0,1,0,0)   (2,1,0,0)  (1,2,0,0)  (1,1,1,0)  (1,1,0,1)  (0,3,0,0)  (0,2,1,0)  (0,2,0,1)  (0,1,2,0)  (0,1,1,1)  (0,1,0,2)
(0,0,1,0)   (2,0,1,0)  (1,1,1,0)  (1,0,2,0)  (1,0,1,1)  (0,2,1,0)  (0,1,2,0)  (0,1,1,1)  (0,0,3,0)  (0,0,2,1)  (0,0,1,2)
(0,0,0,1)   (2,0,0,1)  (1,1,0,1)  (1,0,1,1)  (1,0,0,2)  (0,2,0,1)  (0,1,1,1)  (0,1,0,2)  (0,0,2,1)  (0,0,1,2)  (0,0,0,3)

As we can see, there are 20 repetitions in the symbols formed. Hence, the total number of symbols seen by Eve becomes 40 - 20 = 20 symbols. If we look through the columns (1,1,0,0), (0,1,1,0), and (1,0,1,0), we can find the element (1,1,1,0) common to all three columns. There are two more similar cases of an element common to three columns. From this, we conclude that message sets with an even distribution of messages tend to have a single element in common with many of the others, whereas those with skewed distributions tend to be unique. This is expected, as there are many ways to distribute messages over several receivers, while there is only one way for all senders to send to the same receiver. The channel matrix (split into two) is given below.
            (3,0,0,0) (2,1,0,0) (2,0,1,0) (2,0,0,1) (1,2,0,0) (1,0,2,0) (1,0,0,2) (1,1,1,0) (1,1,0,1) (1,0,1,1)
        0      p^2     2pq/3     2pq/3     2pq/3     q^2/9     q^2/9     q^2/9    2q^2/9    2q^2/9    2q^2/9
M_2.3 = 1       0       p^2        0         0       2pq/3       0         0      2pq/3     2pq/3       0
        2       0        0        p^2        0         0       2pq/3       0      2pq/3       0       2pq/3
        3       0        0         0        p^2        0         0       2pq/3      0       2pq/3     2pq/3

Figure 5-11: Capacity for N = 2 Clueless Senders and M = 3 Receivers

            (0,3,0,0) (0,2,1,0) (0,2,0,1) (0,1,2,0) (0,1,0,2) (0,1,1,1) (0,0,3,0) (0,0,2,1) (0,0,1,2) (0,0,0,3)
        0       0         0         0         0         0         0         0         0         0         0
        1     q^2/9    2q^2/9    2q^2/9     q^2/9     q^2/9    2q^2/9       0         0         0         0
        2       0       q^2/9       0      2q^2/9       0      2q^2/9     q^2/9    2q^2/9     q^2/9       0
        3       0         0       q^2/9       0      2q^2/9    2q^2/9       0       q^2/9    2q^2/9     q^2/9

The 4 x 20 channel matrix M_2.3[i,j] represents the conditional probability of Eve receiving ej when Alice sends a message to receiver Ri. The generalized formulas for the matrix elements, written in terms of the output symbol (a0, a1, a2, a3), are

m(0,j) = [2! / ((a0-1)! a1! a2! a3!)] p^(a0-1) (q/3)^(3-a0)   for a0 = 1, 2, 3, and 0 for a0 = 0;

m(i,j) = [2! / (a0! ... (ai-1)! ... a3!)] p^(a0) (q/3)^(2-a0)   for ai = 1, 2, 3, and 0 for ai = 0, where i = 1, 2, 3.

Figure 5-12: Case 7: System with N = 2 Clueless Senders and M = 3 Receivers

Figure 5-13: Case 8: System with N = 1 Clueless Sender and M Receivers

Figure 5-11 shows the capacity for this case in the curve for N = 2. The minimum capacity is found at p = 1/4 = 1/(M+1). From the plot, the minimum capacity is approximately 0.89, when p = 1/4, which is less than the lowest capacity for the N = 1 case.

5.3.4 Some Generalized Cases of N and M

Case 8: N = 1 Clueless Sender and M Receivers. We generalize the scenario to one clueless transmitter and M receivers. The probabilities describing the actions of only the one clueless sender are given below.

(1,0,0,...,0)  (0,1,0,...,0)  (0,0,1,...,0)  ...  (0,0,0,...,1)
(      p            q/M            q/M       ...       q/M     )

The message-set matrix is given below.
              (1,0,0,...,0)    (0,1,0,...,0)    (0,0,1,...,0)    ...    (0,0,0,...,1)
(1,0,0,...,0) (2,0,0,...,0)    (1,1,0,...,0)    (1,0,1,...,0)    ...    (1,0,0,...,1)
(0,1,0,...,0) (1,1,0,...,0)    (0,2,0,...,0)    (0,1,1,...,0)    ...    (0,1,0,...,1)
(0,0,1,...,0) (1,0,1,...,0)    (0,1,1,...,0)    (0,0,2,...,0)    ...    (0,0,1,...,1)
     ...           ...              ...              ...         ...        ...
(0,0,0,...,1) (1,0,0,...,1)    (0,1,0,...,1)    (0,0,1,...,1)    ...    (0,0,0,...,2)

The number of output symbols that may be seen by Eve is identical to the number of distinct entries in the message-set matrix shown above. There are two indistinguishable transmissions (including null transmissions) and they are sent to M + 1 distinct receivers (urns) (this also includes the null transmission, which by convention goes to R0, not shown in the figure). Combinatorics tells us then that there are C(M+2, 2) distinct combinations (symbols) that Eve may receive. The channel matrix is given below.

           (2,0,...,0)  (1,1,0,...,0)  (1,0,1,...,0)  ...  (1,0,...,0,1)  (0,2,0,...,0)  ...  (0,0,...,0,2)
        0       p            q/M            q/M       ...       q/M             0        ...       0
        1       0             p              0        ...        0             q/M       ...       0
M_1.M = 2       0             0              p        ...        0              0        ...       0
       ...
        M       0             0              0        ...        p              0        ...      q/M

The (M+1) x C(M+2, 2) channel matrix M_1.M[i,j] represents the conditional probability of Eve receiving ej when Alice sends a message to receiver Ri. The elements of the channel matrix can be calculated by the formulas below (here N = 1):

M_1.M[0,j] = p^(a_j^0 - 1) (q/M)^(N+1-a_j^0)   if a_j^0 ≥ 1, and 0 if a_j^0 = 0;

M_1.M[i,j] = p^(a_j^0) (q/M)^(N-a_j^0)         if a_j^i ≥ 1, and 0 if a_j^i = 0, for i = 1, 2, ..., M.

The conclusions and more generalizations related to this case are discussed in the results section.

Case 9: N Clueless Senders and M = 2 Receivers. In this case, we generalize the problem to N clueless transmitters for the two-receivers case.
The total number of message-set symbols seen by Eve can be calculated as the number of combinations in which the transmitters can send (or not send) a message, times the number of combinations in which the sent messages can be distributed to the two receivers. If k of the transmitters send a message, then the k messages can be divided between the two receivers in k + 1 possible combinations ((k,0), (k-1,1), ..., (0,k)). Counting Alice among the transmitters, k ranges from 0 to N + 1, so

message-set size = 1 + 2 + 3 + ... + (N+2) = Σ_{k=0}^{N+1} (k+1) = (N+2)(N+3)/2.

The probability of each channel state with the clueless senders only is as follows.

(N,0,0)   (N-1,1,0)      (N-1,0,1)      (N-2,2,0)             (N-2,1,1)             (N-2,0,2)            ...  (0,0,N)
( p^N    Np^(N-1)q/2    Np^(N-1)q/2   N(N-1)p^(N-2)q^2/8    N(N-1)p^(N-2)q^2/4    N(N-1)p^(N-2)q^2/8     ...  (q/2)^N )

Now let us merge the individual message sets of Alice and the N clueless transmitters to determine the number of symbols received by Eve.

          (N,0,0)    (N-1,1,0)  (N-1,0,1)  (N-2,2,0)  (N-2,1,1)  (N-2,0,2)  ...  (0,0,N)
(1,0,0)   (N+1,0,0)  (N,1,0)    (N,0,1)    (N-1,2,0)  (N-1,1,1)  (N-1,0,2)  ...  (1,0,N)
(0,1,0)   (N,1,0)    (N-1,2,0)  (N-1,1,1)  (N-2,3,0)  (N-2,2,1)  (N-2,1,2)  ...  (0,1,N)
(0,0,1)   (N,0,1)    (N-1,1,1)  (N-1,0,2)  (N-2,2,1)  (N-2,1,2)  (N-2,0,3)  ...  (0,0,N+1)

As observed before, the message set (N/3 + 1, N/3, N/3) is the most uniform message distribution. Hence, it has the maximum number of repetitions in the message-set matrix and will have a greater probability of being observed than (N+1,0,0) or (0,1,N).

Figure 5-14: Case 9: System with N Clueless Senders and M = 2 Receivers

The channel matrix M_N.2 is given below.

           (N+1,0,0)  (N,1,0)       (N,0,1)       (N-1,2,0)            (N-1,1,1)            (N-1,0,2)           ...  (0,0,N+1)
        0    p^N      Np^(N-1)q/2   Np^(N-1)q/2   N(N-1)p^(N-2)q^2/8   N(N-1)p^(N-2)q^2/4   N(N-1)p^(N-2)q^2/8  ...     0
M_N.2 = 1     0         p^N            0          Np^(N-1)q/2          Np^(N-1)q/2             0                ...     0
        2     0          0            p^N             0                Np^(N-1)q/2          Np^(N-1)q/2         ...  (q/2)^N

The 3 x ((N+2)(N+3)/2) channel matrix M_N.2[i,j] represents the conditional probability of Eve receiving ej when Alice sends a message to receiver Ri.
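Both counting results — C(M+2, 2) in Case 8 and (N+2)(N+3)/2 in Case 9 — are instances of the stars-and-bars count C(N+M+1, M): N+1 indistinguishable transmissions into M+1 urns. A sketch (ours) cross-checks the formula by brute-force enumeration:

```python
from math import comb
from itertools import product

def symbol_count(N, M):
    """N + 1 indistinguishable transmissions dropped into M + 1 urns
    (urn R0 collects the null transmissions): C(N + M + 1, M) symbols."""
    return comb(N + M + 1, M)

def enumerated(N, M):
    """Count distinct output tuples by trying every sender's choice."""
    outs = set()
    for choices in product(range(M + 1), repeat=N + 1):
        e = [0] * (M + 1)
        for c in choices:
            e[c] += 1
        outs.add(tuple(e))
    return len(outs)

for N, M in [(1, 2), (2, 2), (1, 3), (2, 3), (4, 2)]:
    assert symbol_count(N, M) == enumerated(N, M)

# Case 9 (M = 2): C(N + 3, 2) = (N + 2)(N + 3) / 2, e.g. 10 for N = 2
print(symbol_count(2, 2), (2 + 2) * (2 + 3) // 2)
```

For Case 8 (N = 1), the same formula gives C(M+2, 2); for Cases 4 and 7 it gives 6 and 20, the counts found by hand above.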
The probability distribution in the channel matrix can be imagined as a nesting of two binomial distributions: first, between messages sent and not sent; second, the distribution of the sent messages over the two receivers. So, given the output symbol (a_j^0, a_j^1, a_j^2), the elements of the channel matrix can be generalized by the formulas below.

m_0j = C(N, a_j^0 - 1) p^(a_j^0 - 1) C(N - (a_j^0 - 1), a_j^1) (q/2)^(N - (a_j^0 - 1))

m_1j = C(N, a_j^0) p^(a_j^0) C(N - a_j^0, a_j^1 - 1) (q/2)^(N - a_j^0)

m_2j = C(N, a_j^0) p^(a_j^0) C(N - a_j^0, a_j^2 - 1) (q/2)^(N - a_j^0)

Here C(n, k) is the binomial coefficient, taken to be zero when k is negative; the second binomial factor counts the ways the clueless senders' messages are split between R1 and R2, each sent message going to either receiver with probability q/2. Note that a_j^2 does not explicitly appear in m_0j and m_1j, but it is implicit, since (a_j^0 + a_j^1 + a_j^2) - 1 = N; this relationship will be seen to be important in the following general case (where we use a generalized combinatorial formula).

The conclusions and more generalizations related to this case are discussed in the results section.

Case 10: N Clueless Senders and M Receivers. We now generalize the problem to N clueless senders and M receivers (refer again to Figure 5-4). There are N + 1 indistinguishable transmissions (including null transmissions) and they are sent to M + 1 distinct receivers (urns) (this also includes the null transmission, which by convention goes to R0, not shown in the figure). Combinatorics tells us then that there are K = C(N+M+1, M) possible symbols ej. The rows of our channel matrix correspond to the actions of Alice. The ith row of M_N.M describes the conditional probabilities p(ej | xi). (For simplicity we will not always explicitly note that j = 1, ..., C(N+M+1, M).) By convention, e1 corresponds to every sender not sending a message (which is equivalent to all senders sending to R0). Therefore e1 is the (M+1)-tuple (N+1, 0, ..., 0). Given our simplifying semi-uniformity assumption for the clueless senders' distribution, this term must be handled differently. The first row of the channel matrix is made up of the terms M_N.M[0,j].
Here, Alice is not sending any message (i.e., she is "sending" to R0), so Alice contributes one to the term a_j^0 in the (M+1)-tuple (a_j^0, a_j^1, ..., a_j^M) associated with ej. In fact, this tuple is the "long-hand" representation of ej. Therefore, the contributions to the (M+1)-tuple (a_j^0 - 1, a_j^1, a_j^2, ..., a_j^M) describe what the N clueless senders are doing. That is, a_j^0 - 1 clueless senders are not sending a message, a_j^1 clueless senders are sending to R1, etc. Hence, the multinomial coefficient C(N; a_j^0 - 1, a_j^1, ..., a_j^M), where C(n; k_0, ..., k_M) = n!/(k_0! ... k_M!), tells us how many ways this may occur.5 For each such occurrence we see that the transmissions to R0 affect the probability by p^(a_j^0 - 1), and the transmissions to Ri, i > 0, due to the semi-uniformity assumption, contribute (q/M)^(a_j^i). Since the actions are independent, the probabilities multiply, and since (a_j^0 - 1) + a_j^1 + ... + a_j^M = N, we have a probability term of p^(a_j^0 - 1) (q/M)^(N+1-a_j^0). Multiplying that term by the total number of ways of arriving at that arrangement, we have that:

M_N.M[0,j] = C(N; a_j^0 - 1, a_j^1, ..., a_j^M) p^(a_j^0 - 1) (q/M)^(N+1-a_j^0).

5 The multinomial coefficient is taken to be zero if any of the "bottom" entries are negative.

The other rows of the channel matrix are M_N.M[i,j], i > 0. For row i > 0, we have a combinatorial term C(N; a_j^0, ..., a_j^i - 1, ..., a_j^M) for the N clueless senders, a_j^0 of which are sending to R0 and N - a_j^0 of which are sending to the Ri, i > 0. Therefore, we see that under the uniformity assumption,

M_N.M[i,j] = C(N; a_j^0, ..., a_j^i - 1, ..., a_j^M) p^(a_j^0) (q/M)^(N-a_j^0),  i > 0.

We show the plots of the mutual information when the clueless senders act (as assumed throughout the report) in a semi-uniform manner and when Alice also sends in a semi-uniform manner (i.e., xi = (1 - x0)/M, i = 1, 2, ..., M). We conjecture, based upon our intuition but without proof, that when the clueless senders act in a semi-uniform manner, Alice having a semi-uniform distribution over the destinations R1, ..., RM maximizes mutual information (achieves capacity). This has been supported by all of our numeric computations of capacity.
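The two channel-matrix formulas above can be checked numerically. The sketch below (our code; names are ours) builds each row of M_N.M by subtracting Alice's contribution from the output tuple and applying the multinomial term, then verifies that every row is a probability distribution:

```python
import math

def multinomial(n, parts):
    """n! / (k0! k1! ...); zero if any entry is negative (footnote 5)."""
    if any(k < 0 for k in parts) or sum(parts) != n:
        return 0
    r = math.factorial(n)
    for k in parts:
        r //= math.factorial(k)
    return r

def compositions(total, bins):
    """All tuples of `bins` non-negative integers summing to `total`."""
    if bins == 1:
        yield (total,)
        return
    for head in range(total + 1):
        for tail in compositions(total - head, bins - 1):
            yield (head,) + tail

def channel_row(N, M, p, i):
    """Row i of M_N.M: remove Alice's message from component i of the output
    tuple; what remains is the clueless senders' multinomial term."""
    q = 1.0 - p
    row = {}
    for e in compositions(N + 1, M + 1):
        a = list(e)
        a[i] -= 1                              # Alice's contribution
        coeff = multinomial(N, a)
        if coeff:
            row[e] = coeff * p ** a[0] * (q / M) ** (N - a[0])
    return row

# Sanity checks for N = 2, M = 3 (Case 7): each row sums to one, and the
# rows together cover all C(6, 3) = 20 output symbols.
rows = [channel_row(2, 3, 0.25, i) for i in range(4)]
assert all(abs(sum(r.values()) - 1.0) < 1e-12 for r in rows)
print(len(set().union(*rows)))   # 20
```

Note that the same code yields both the i = 0 and i > 0 formulas: subtracting Alice's message from a_j^0 automatically produces the p^(a_j^0 - 1) (q/M)^(N+1-a_j^0) form for the first row.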
With this conjecture, we can reduce the degrees of freedom for Alice from M to 1 (her distribution A is described entirely by x0), which allows greater experimental and analytical exploration.

The channel matrix greatly simplifies when both the clueless senders and Alice act in a totally uniform manner. That is, when x0 = 1/(M+1), then xi = (1 - x0)/M = 1/(M+1) for all i, and p = 1/(M+1). We have

M_N.M[0,j] = C(N; a_j^0 - 1, a_j^1, ..., a_j^M) p^(a_j^0 - 1) (q/M)^(N+1-a_j^0),

which simplifies to

M_N.M[0,j] = C(N; a_j^0 - 1, a_j^1, ..., a_j^M) (M+1)^(-N).

(Note this form for i = 0 is due to the total uniformity of the clueless senders.) We also have

M_N.M[i,j] = C(N; a_j^0, a_j^1, ..., a_j^(i-1), a_j^i - 1, a_j^(i+1), ..., a_j^M) p^(a_j^0) (q/M)^(N-a_j^0),  i > 0,

which simplifies to

M_N.M[i,j] = C(N; a_j^0, ..., a_j^i - 1, ..., a_j^M) (M+1)^(-N),  i > 0.

Table 1. Lower capacity bounds for N = 0, ..., 9 and M = 1, ..., 10

 N\M      1       2       3       4       5       6       7       8       9      10
  0    0.3113  1.5849  2.0000  2.3219  2.5850  2.8074  3.0000  3.1699  3.2192  3.4594
  1    0.2193  0.9172  1.2500  1.5219  1.7515  1.9502  2.1250  2.2811  2.4219  2.5503
  2    0.1675  0.6204  0.8891  1.1204  1.3218  1.4996  1.6586  1.8021  1.9328  2.0529
  3    0.1351  0.4555  0.6760  0.8423  1.0515  1.2112  1.3560  1.4882  1.6097  1.7221
  4    0.1133  0.3537  0.5371  0.7080  0.8649  1.0090  1.1410  1.2630  1.3761  1.4813
  5    0.0976  0.2864  0.4408  0.5893  0.7288  0.8588  0.9798  1.0925  1.1978  1.2965
  6    0.0857  0.2392  0.3710  0.5010  0.6255  0.7434  0.8544  0.9587  1.0570  1.1496
  7    0.0765  0.2048  0.3187  0.4334  0.5450  0.6522  0.7542  0.8510  0.9428  1.0298
  8    0.0691  0.1789  0.2785  0.3803  0.4809  0.5786  0.6726  0.7626  0.8484  0.9303
  9    0.0630  0.1587  0.2467  0.3377  0.4288  0.5183  0.6051  0.6888  0.7692  0.8463

To determine the distribution E describing Eve, we need to sum over the columns of the channel matrix and use the total uniformity of A:

P(E = ej) = Σ_{i=0}^{M} P(E = ej | A = i) P(A = i),  j = 1, ..., K.

This gives us

P(E = ej) = (M+1)^(-(N+1)) Σ_{i=0}^{M} C(N; a_j^0, ..., a_j^i - 1, ..., a_j^M) = C(N+1; a_j^0, a_j^1, ..., a_j^M) (M+1)^(-(N+1)),

where the last step uses the multinomial form of Pascal's rule. From this we can compute the entropy H(E) without too much trouble:

H(E) = Σ_j C(N+1; a_j^0, ..., a_j^M) (M+1)^(-(N+1)) [ (N+1) log(M+1) - log C(N+1; a_j^0, ..., a_j^M) ].

However, the conditional entropy is more complicated, but is expressible.
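Under total uniformity the mutual information reduces to a difference of two count entropies — H(E) for the N+1 uniform senders minus H(E|A) for the N clueless senders alone — which makes the Table 1 entries easy to spot-check. A sketch (our code, not the thesis's Matlab; the values we checked in the M = 3 column agree with the table):

```python
import math
from itertools import product

def capacity_lower_bound(N, M):
    """I(A;E) when Alice and the N clueless senders are all totally uniform
    over the M + 1 actions (p = x0 = 1/(M+1)); conjectured in the text to be
    the lower bound on capacity over p."""
    u = 1.0 / (M + 1)
    def count_entropy(n):
        # entropy of the urn counts of n uniform throws into M + 1 urns
        dist = {}
        for choices in product(range(M + 1), repeat=n):
            e = [0] * (M + 1)
            for c in choices:
                e[c] += 1
            e = tuple(e)
            dist[e] = dist.get(e, 0.0) + u ** n
        return -sum(v * math.log2(v) for v in dist.values())
    return count_entropy(N + 1) - count_entropy(N)

print(round(capacity_lower_bound(0, 3), 4))   # 2.0 = log2(M+1), noiseless
print(round(capacity_lower_bound(1, 3), 4))   # 1.25
print(round(capacity_lower_bound(2, 3), 4))   # 0.8891
```

The N = 0 row is the noiseless channel, so the value is exactly log2(M+1), matching the upper-bound remark in Chapter 6.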
Therefore, we wrote Matlab code to calculate the mutual information, which is conjectured to achieve capacity, when the clueless senders act in a semi-uniform manner and Alice acts in a totally uniform manner. Local exploration of nearby points always yields lower mutual-information values. Table 1 tabulates the results of numerical calculations of capacities for different combinations of values of N and M using Matlab. We conjecture that when the clueless senders are totally uniform (p = 1/(M+1)), capacity is achieved by Alice acting in a totally uniform manner (every one of Alice's probabilities is 1/(M+1)), and that this capacity is the lower bound over all p. The table therefore gives the capacity with p fixed at 1/(M+1), which we determined numerically to be less than the capacity for other values of p.

5.3.5 Non-Uniform Message Distributions

Each of the senders (including Alice) can have a different message distribution over the receivers. We consider the 80/20 and the more practical "Zipf" distributions and explain each of them with respect to our scenario.

Zipf distribution. Zipf's distribution refers to the frequency of occurrence of an event relative to its rank r. There are two Zipf's laws: the rank-frequency one and the frequency-count one. According to the rank-frequency law, the frequency of the rth largest occurrence of the event is inversely proportional to its rank:

f_r ∝ 1/r^θ.

This is typically referred to as Zipf's law or the Zipf distribution. The rank-frequency plot is a straight line with slope -θ on a log-log scale. The second law gives the count of events that have a frequency f:

C_f ∝ 1/f^λ.

We can easily prove that the second law is a mathematical consequence of the first one. It can also be shown that λ = 1 + 1/θ. We now calculate the message distribution probabilities for the Zipf distribution in the case of one clueless transmitter (N = 1) and five receivers (M = 5).
The probability distribution is given by the following (note that in this subsection p denotes the total probability that a clueless sender transmits, so the probability of remaining silent is 1 - p):

P(clueless sends to R1) = c · 1/1
P(clueless sends to R2) = c · 1/2
P(clueless sends to R3) = c · 1/3
P(clueless sends to R4) = c · 1/4
P(clueless sends to R5) = c · 1/5
P(clueless does not send a message) = 1 - p

The constant c is given by 60p/137 (since 1 + 1/2 + 1/3 + 1/4 + 1/5 = 137/60), and the probabilities for sending to the various receivers are 60p/137, 30p/137, 20p/137, 15p/137, and 12p/137.

80/20 distribution. According to this distribution, 80% of the messages are sent to 20% of the recipients, and the remaining 20% to 80% of the recipients. Let us assume, without loss of generality, that the first M/5 receivers get 80% of the messages and the remaining receivers get the other 20%. The probability distribution of a clueless transmitter is as follows:

P(clueless sends to Ri, i = 1, 2, ..., M/5) = (4p/5) / (M/5) = 4p/M
P(clueless sends to Ri, i = M/5 + 1, ..., M) = (p/5) / (4M/5) = p/(4M)
P(clueless does not send a message) = 1 - p

For the probability distribution of Alice, there are three different probabilities: the first for not sending a message, the second for sending to the first M/5 receivers, and the third for the remaining 4M/5 receivers.

5.4 Summary

This chapter presents the capacity analysis of the covert channel scenario. Since the mathematics involved in the analysis is very complex, many simple cases are analyzed. These include many cases involving combinations of N = 1, 2, 3, 4 additional transmitters and M = 1, 2, 3 receivers. Based on the observations from the different cases, the channel matrix and the entropy for the generalized case are discussed. Finally, Zipf and 80/20 message distributions are considered for Alice and the clueless transmitters. The results of the calculations presented, and generalizations of the results, are given in the next chapter.

CHAPTER 6
DISCUSSION OF RESULTS

6.1 Capacity vs. Clueless Transmitters

Figure 6-1 shows the capacity as a function of p with M = 2 receivers, for N = 1, 2, 3, 4 clueless senders.
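The qualitative claims read off Figure 6-1 can be spot-checked using the nested-binomial channel matrix of Case 9. The sketch below (our code; function names are ours) evaluates the mutual information at the uniform input for N = 1 through 4:

```python
import math
from math import comb

def row_MN2(N, p, i):
    """Row i of M_N.2 via Case 9's nested binomials: a0 of the N clueless
    senders stay silent (prob p each); each of the others goes to R1 or R2
    with probability q/2."""
    q = 1.0 - p
    row = {}
    for a0 in range(N + 1):
        for a1 in range(N - a0 + 1):
            a = [a0, a1, N - a0 - a1]               # clueless contribution
            pr = comb(N, a0) * p**a0 * comb(N - a0, a1) * (q / 2)**(N - a0)
            a[i] += 1                               # add Alice's message
            e = tuple(a)
            row[e] = row.get(e, 0.0) + pr
    return row

def mi_uniform(N, p):
    """I(A;E) with Alice uniform over her three actions."""
    rows = [row_MN2(N, p, i) for i in range(3)]
    out = {}
    for r in rows:
        for e, v in r.items():
            out[e] = out.get(e, 0.0) + v / 3
    H = lambda d: -sum(v * math.log2(v) for v in d.values() if v > 0)
    return H(out) - sum(H(r) for r in rows) / 3

vals = [mi_uniform(N, 1/3) for N in range(1, 5)]
print([round(v, 2) for v in vals])   # [0.92, 0.62, 0.46, 0.35]
```

As the text states, the values at p = 1/3 shrink as N grows, and at p = 1 (silent clueless senders) the channel is noiseless with mutual information log 3.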
In all cases, the minimum capacity is realized at p = 1/3, and the capacity at p = 1 is log 3. As N increases, the capacity decreases, with the most marked effects at p = 1/3. In Figure 6-1, the capacity (of course under the semi-uniformity assumption for C, which is in force throughout the report) was determined numerically for any choice of A. However, for the remaining plots, we applied the semi-uniformity conjecture (that Alice is better off behaving semi-uniformly if that is what the clueless senders do). Thus, x0 is the only free variable for Alice's distribution in what follows.

6.2 Capacity vs. Number of Receivers

Figure 6-2 shows the capacity as a function of p with M = 3 receivers, for N = 1, 2, 4 clueless senders. As expected, in all cases, the minimum capacity is realized at p = 1/4, and the capacity at p = 1 is log 4 = 2. As N increases, the capacity decreases, with the most marked effects at p = 1/4. The minimum capacity is greater when compared to the corresponding value in the M = 2 case (refer to Figure 6-1).

The mutual information as a function of x0 is shown in Figure 6-3 for M = 2 receivers and N = 1 clueless sender, for p = 0.25, 0.33, 0.5, 0.67. Here, note that the curve with p = 0.33 has the smallest maximum value (capacity), and that the value of x0 at which that maximum occurs is x0 = 0.33. The x0 value that maximizes the mutual information (i.e., for which capacity is reached) for the other curves is not 0.33, but the mutual information at x0 = 0.33 is not much less than the capacity for any of the curves.

Figure 6-4 shows the mutual information curves for various values of x0 as a function of p, with N = 2 clueless senders and M = 2 receivers.

Figure 6-1: Capacity for N = 1 to 4 Clueless Senders and M = 2 Receivers

Figure 6-2: Capacity for N = 1, 2, 4 Clueless Senders and M = 3 Receivers
Figure 6-3: Mutual Information vs. x0 for p = 0.25, 0.33, 0.5, 0.67, for N = 1 Clueless Sender and M = 2 Receivers

Figure 6-4: Mutual Information vs. p for N = 2 Clueless Senders and M = 2 Receivers

Figure 6-5: Mutual Information vs. p for N = 2 Clueless Senders and M = 3 Receivers

Similarly, Figure 6-5 shows the mutual information curves for various values of x0 as a function of p, with N = 2 clueless senders and M = 3 receivers. In Figure 6-4, note that the curve for x0 = 1/(M+1) = 1/3 has the largest minimum mutual information, and also has the greatest mutual information at the point where p = 1, i.e., when there is no noise since the clueless senders are not sending any messages. The capacity for various values of p is, in essence, the curve that is the maximum at each p over all of the x0 curves, and the lower bound on capacity occurs at p = 1/3 = 1/(M+1). Also observe that the x0 = 0.33 curve has the highest value for p = 0.33, but for other values of p, other values of x0 have higher mutual information (i.e., Alice has a strategy better than using x0 = 0.33). However, the mutual information when x0 = 0.33 is never much less than the capacity at any value of p, so in the absence of information about the behavior of the clueless senders, a good strategy for Alice is to just use x0 = 1/(M+1). These observations are illustrated and expanded in the next two figures. Note the differences in concavity between Figure 6-3 and Figure 6-4; we will discuss concavity again later in the report.

Figure 6-6 shows the optimal value for x0, i.e., the one that maximizes mutual information and hence achieves channel capacity, for N = 1, 2, 3, 4 clueless senders and M = 3 receivers, as a function of p.
Figure 6-6: Value of x0 that Maximizes Mutual Information for N = 1, 2, 3, 4 Clueless Senders and M = 3 Receivers, as a Function of p

A similar graph in [13] for M = 1 receiver is symmetric about x0 = 0.5, but for M > 1 the symmetry is multidimensional, and the graph projected to the (p, x0)-plane where the destinations are uniformly distributed is not symmetric. However, note that the optimum choice of x0 is 1/(M+1) both at p = 1/(M+1) and at p = 1, that is, when the clueless senders either create maximum noise or do not transmit at all (no noise). As N increases, the optimum x0 for other values of p moves further from 1/(M+1). Also observe that Alice's best strategy is to do the opposite of what the clueless senders do, up to a point. If they are less likely to send messages (p > 1/(M+1)), then Alice should be more likely to send messages (x0 < 1/(M+1)), whereas if the clueless senders are more likely to send messages (p < 1/(M+1)), then Alice should be less likely to send messages (x0 > 1/(M+1)).

6.3 Capacity vs. Mutual Information at x0 = 1/(M+1)

Figure 6-7 shows the degree to which the choice of x0 = 1/(M+1) can be suboptimal, for N = 1, 2, 3, 4 clueless senders and M = 3 receivers. The plot shows the mutual information for the given p and x0 = 1/(M+1), normalized by dividing by the capacity (maximum mutual information) at that same p. Hence, it shows the degree to which a choice of x0 = 1/(M+1) fails to achieve the maximum mutual information. For N = 2, it is never worse than 0.94 (numerically), but for N = 4, its minimum is 0.88. The relationship of suboptimality for other choices of M and N, or for other distributions, is not known.
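The behavior of the optimal x0 described for Figure 6-6 can be reproduced on the small Case 4 channel (N = 1, M = 2). The sketch below (ours) grid-searches x0 along the symmetric slice x1 = x2 = (1 - x0)/2:

```python
import math

def plogp(v):
    return v * math.log2(v) if v > 0 else 0.0

def mi(p, x0):
    """Case 4 (N = 1, M = 2) mutual information on the symmetric slice
    x1 = x2 = (1 - x0)/2, from the output probabilities of M_1.2."""
    q, x1 = 1.0 - p, (1.0 - x0) / 2
    outs = [p * x0, q * x0 / 2 + p * x1, q * x0 / 2 + p * x1,
            q * x1 / 2, q * x1, q * x1 / 2]
    h2 = -(1 - p) * (math.log2((1 - p) / 2) if p < 1 else 0.0) - plogp(p)
    return -sum(plogp(v) for v in outs) - h2     # H(E) - H(E|A)

def best_x0(p):
    """x0 maximizing mutual information, by a coarse grid search."""
    return max((i / 1000 for i in range(1001)), key=lambda x0: mi(p, x0))

print(best_x0(1/3), best_x0(1.0))   # both ~ 1/3 = 1/(M+1)
print(best_x0(0.6))                 # < 1/3: Alice counteracts quiet senders
```

This matches the pattern in the text: the optimum returns to 1/(M+1) at maximum noise (p = 1/(M+1)) and at no noise (p = 1), and moves in the opposite direction from the clueless senders in between.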
Figure 6-7: Normalized Mutual Information when p = 1/4 for N = 1, 2, 3, 4 Clueless Senders and M = 3 Receivers

Figure 6-8: Capacity for N = 1 Clueless Sender and M = 1 to 5 Receivers

Figure 6-9: Capacity for N = 0 to 9 Clueless Senders and M = 1 to 10 Receivers

In Figure 6-8, we show the lower bound on the capacity of the channel as a function of p for N = 1 clueless sender and various values of M receivers. Numerical results show that this lower bound increases for all p as M increases, and the lower bound on the capacity for a given M occurs at p = 1/(M+1), which is indicated by the dotted lines in the figure.

For Figure 6-9, we take the capacity at p = 1/(M+1), which we found numerically to minimize the capacity of the covert channel, and plot this lower bound for capacity for many values of N and M. We retain the assumption that xi = (1 - x0)/M for i = 1, 2, ..., M; that is, given the semi-uniform distribution of transmissions to the receivers by the clueless senders, it is best for Alice to do likewise. Along the surface where N = 0, we have the noiseless channel, and the capacity is log(M+1), which is also the upper bound on capacity for all N and M. The values along the surface where M = 1 give us the same values we derived in [13].

6.4 Capacity vs. Message Distributions

In Figure 6-10, we show the lower bound on the capacity of the channel for different message distributions of the clueless transmitter, with Alice following the uniform distribution. The 80/20 distribution has the highest value of the lower bound on capacity, followed by the Zipf and the uniform distributions.
Notice that the uniform distribution has 4 2 0 4 6 Clueless Transmitters, N > Receivers, M o a 0) 0 1.2 o0 0 J 0 0.2 0.4 0.6 0.8 1 p = P(Clueless not sending any message) > Figure 610: Capacity for Uniform, Zipf, and 80/20 Distributions for Clueless Trans mitter and Uniform Distribution for Clueless Transmitter the lowest capacity bound of the three distribution, indicating that the capacity of the covert channel increases with lesser uniform distributions. Figure 611 shows the mutual information curves, when plotted for various message distributions followed by Alice, with N = 1 clueless sender and M = 4 receivers and the clueless sender following uniform distribution. From the curve, we deduce that Alice has better channel capacity by maintaining the uniform message distribution, when the clueless transmitter is following uniform distribution. The figure 612 confirms the above fact for the case where Clueless sender follows zipf distribution. Calculating Capacity for different message distributions get more and more complicated because of increase in number of variables and more work needs to be carried out in this area. 6.5 Comments and Generalizations We first note that the maximum capacity of this (covert) quasianonymous channel is log(M + 1) for M distinguishable receivers, and is achievable only if there are no other senders (N = 0), or equivalently, if none of them ever send (p = 1), i.e., when the channel is noiseless. Here are some of the observations from the different cases considered, under the semiuniform assumption for the clueless senders and the semiuniform conjecture for Alice, followed by some generalizations. C 0 1.2 8 5 0 0.2 0.4 0.6 0.8 x0 = P(Alice not sending any message) > i : :e 6 11: Capacity for Uniform, i and i:/20 Distributions for Alice and form Distribution for Clueless Transmitter 0 0.2 0.4 0.6 0.8 x0 = P(Alice not sending any message) > Figure 612: Capacity for Uniform, :i and 80/20 n : :: iutions for Alice and Distribution .. 
The capacity C(p, N, M), as a function of the probability p that a clueless sender remains silent, with N clueless senders and M receivers, is strictly bounded below by C(1/(M + 1), N, M), and this bound is achieved with x_0 = 1/(M + 1).

The lower bound on capacity for a given number M of receivers decreases as the number N of clueless senders increases: C(1/(M + 1), N, M) > C(1/(M + 1), N + 1, M).

The lower bound on capacity for a given number N of clueless senders increases as the number M of distinguishable receivers increases: C(1/(M + 2), N, M + 1) > C(1/(M + 1), N, M).

These observations are intuitive, but we have not shown them to be true analytically in the general case (we did for the case M = 1 in our initial publication [13]). It is interesting to note that increasing the number of distinguishable receivers increases the covert channel capacity, which in some sense decreases the (sender) anonymity in the system (Alice has more room in which to express herself). This is a bit contrary to the intuitive view of anonymity in Mix networks, where more receivers tend to provide "greater anonymity." In this light, we note that Danezis and Serjantov investigated the effects of multiple receivers in statistical attacks on anonymity networks [?]. They found that Alice having multiple receivers greatly lowered a statistical attacker's certainty of Alice's receiver set.

While the graphs and numerical tests support the conjecture that the best the clueless senders can do (to minimize the covert channel capacity) is to send (or not) with uniform probability distribution over the R_i, i = 1, 2, ..., M, we have not proven this mathematically. Nor have we proven that, under these conditions, the best Alice can do is to send (or not) to each receiver R_i with uniform probability, x_i = 1/(M + 1) for i = 0, 1, 2, ..., M, although the numerical computations support this. The proof in [13] of these conjectures for the case where M = 1 relied, in part, on the symmetry about x_0 = 0.5, which is not the case when M > 1, so another approach must be used.
However, we should still be able to use the concavity/convexity results from [13]. Note that our conjecture that the best Alice can do is to send in a semi-uniform manner, and the results illustrated in Figure 6-8, seem to be an extension of the interesting results of [10].

6.6 Summary

The capacity C(p, N, M), as a function of the probability p that a clueless sender remains silent, with N clueless senders and M receivers, is strictly bounded below by C(1/(M + 1), N, M), and this bound is achieved with x_0 = 1/(M + 1). The lower bound on capacity for a given number of receivers decreases as the number of clueless senders increases, and for a given number of clueless senders it increases as the number of distinguishable receivers increases.

CHAPTER 7
CONCLUSIONS AND FUTURE WORK

This thesis has taken a step towards tying the notion of the capacity of a quasi-anonymous channel associated with an anonymity network to the amount of anonymity that the network provides. It explores the particular situation of a simple type of timed Mix (it fires every tick) that also acts as an exit firewall. Cases for varying numbers of distinguishable receivers and varying numbers of senders were considered, resulting in the observations that more senders (not surprisingly) decrease the covert channel capacity, while more receivers increase it. The latter observation is intuitive to communication engineers, but may not have occurred to many in the anonymity community, since the focus there is often on sender anonymity.

As the entropy H of the probability distribution associated with a message output from a Mix gives the effective size, 2^H, of the anonymity set, we wonder whether the capacity of the residual quasi-anonymous channel in an anonymity system provides some measure of the effective size of the anonymity set for the system as a whole.
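The effective-anonymity-set computation mentioned here is straightforward to make concrete; a minimal sketch (the function name is ours):

```python
import math

def effective_anonymity_set_size(probs):
    """Effective size 2**H of the anonymity set, where H is the Shannon
    entropy (in bits) of the distribution over candidate senders."""
    H = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** H

# A uniform distribution over 8 candidates gives the full set size, 8.0;
# a distribution concentrated on 2 of them gives an effective size of 2.0.
full = effective_anonymity_set_size([1 / 8] * 8)
skewed = effective_anonymity_set_size([0.5, 0.5] + [0.0] * 6)
```

The question raised in the text is whether an analogous "effective number of clueless senders" can be defined by equating observed covert channel capacity with the capacity under maximum-entropy clueless behavior.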
That is, using the covert channel capacity as a standard yardstick, can we take the capacity of the covert channel for the observed transmission characteristics of the clueless senders, equate it with the capacity for a (possibly smaller) set of clueless senders with maximum entropy (i.e., who introduce the maximum amount of noise into the channel for Alice), and use the size of this latter set as the effective number of clueless senders in the system? This is illustrated in Figure 6-1, with the vertical dashed line showing that N = 4 clueless senders that remain silent with probability p = 0.87 are in some sense equivalent to a single clueless sender with p = 0.33.

The case in which the Mix itself injects dummy messages into the stream randomly is not distinguishable from having an additional clueless sender. However, if the Mix predicates its injection of dummy messages upon the activity of the senders, then it can affect the channel matrix greatly, to the point of eliminating the covert channel entirely. We are also interested in the degree to which the Mix can reduce the covert channel capacity (increase anonymity) with a limited ability to inject dummy messages.

REFERENCES

[1] Adam Back, Ulf Möller, and Anton Stiglic. Traffic analysis attacks and trade-offs in anonymity providing systems. In Ira S. Moskowitz, editor, Information Hiding, 4th International Workshop (IH 2001), pages 245-257. Springer-Verlag, LNCS 2137, 2001.

[2] P. Boucher, I. Goldberg, and A. Shostack. Freedom system 2.0 architecture. http://www.freedom.net/info/whitepapers/, December 2000. Zero-Knowledge Systems, Inc.

[3] David Chaum. Untraceable electronic mail, return addresses and digital pseudonyms. Communications of the ACM, 24(2):84-88, 1981.

[4] David Chaum. The dining cryptographers problem: Unconditional sender and recipient untraceability. Journal of Cryptology: the Journal of the International Association for Cryptologic Research, 1(1):65-75, 1988.

[5] L. Cottrell.
Mixmaster and remailer attacks, August 1994. http://www.obscura.com/~loki/remailer/remailer-essay.html, August 2004.

[6] Claudia Diaz, Stefaan Seys, Joris Claessens, and Bart Preneel. Towards measuring anonymity. In Paul Syverson and Roger Dingledine, editors, Privacy Enhancing Technologies (PET 2002). Springer-Verlag, LNCS 2482, April 2002.

[7] D. Goldschlag, M. Reed, and P. Syverson. Onion routing for anonymous and private Internet connections. Communications of the ACM, 42(2):39-41, 1999.

[8] C. Gülcü and G. Tsudik. Mixing email with Babel. In Internet Society Symposium on Network and Distributed System Security (NDSS '96), pages 2-16, San Diego, CA, February 1996.

[9] D. Kesdogan, J. Egner, and R. Büschkes. Stop-and-go MIXes providing probabilistic anonymity in an open system. In Proceedings of the International Information Hiding Workshop, April 1998.

[10] E. E. Majani and H. Rumsey. Two results on binary input discrete memoryless channels. In IEEE International Symposium on Information Theory, page 104, June 1991.

[11] Ulf Moeller and Lance Cottrell. Mixmaster Protocol Version 3, 2000. http://www.eskimo.com/~rowdenw/crypt/Mix/draft-moeller-v3-01.txt, August 2004.

[12] Ira S. Moskowitz and Myong H. Kang. Covert channels here to stay? In Proc. COMPASS '94, pages 235-243, Gaithersburg, MD, June 27-July 1, 1994. IEEE Press.

[13] Ira S. Moskowitz, Richard E. Newman, Daniel P. Crepeau, and Allen R. Miller. Covert channels and anonymizing networks. In ACM WPES, pages 79-88, Washington, DC, October 2003.

[14] Ira S. Moskowitz, Richard E. Newman, and Paul F. Syverson. Quasi-anonymous channels. In IASTED CNIS, pages 126-131, New York, December 2003.

[15] R. E. Newman-Wolfe and B. R. Venkatraman. High level prevention of traffic analysis. In Proc. IEEE/ACM Seventh Annual Computer Security Applications Conference, pages 102-109, San Antonio, TX, December 2-6, 1991. IEEE CS Press.

[16] R. E. Newman-Wolfe and B. R. Venkatraman.
Performance analysis of a method for high level prevention of traffic analysis. In Proc. IEEE/ACM Eighth Annual Computer Security Applications Conference, pages 123-130, San Antonio, TX, November 30-December 4, 1992. IEEE CS Press.

[17] Onion routing home page. http://www.onion-router.net, August 2004.

[18] J. Raymond. Traffic analysis: Protocols, attacks, design issues, and open problems. In Hannes Federrath, editor, Designing Privacy Enhancing Technologies: Design Issues in Anonymity and Unobservability, pages 10-29. Springer-Verlag, LNCS 2009, July 2000.

[19] Michael K. Reiter and Aviel D. Rubin. Crowds: Anonymity for web transactions. ACM Transactions on Information and System Security, 1(1):66-92, 1998.

[20] Andrei Serjantov and George Danezis. Towards an information theoretic metric for anonymity. In Paul Syverson and Roger Dingledine, editors, Privacy Enhancing Technologies (PET 2002). Springer-Verlag, LNCS 2482, April 2002.

[21] Andrei Serjantov, Roger Dingledine, and Paul Syverson. From a trickle to a flood: Active attacks on several mix types. In IH 2002, pages 36-52, Noordwijkerhout, The Netherlands, October 2002.

[22] Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379-423, 623-656, 1948.

[23] Claude E. Shannon. The zero error capacity of a noisy channel. IRE Transactions on Information Theory, IT-2(3):8-19, September 1956.

[24] P. F. Syverson, D. M. Goldschlag, and M. G. Reed. Anonymous connections and onion routing. In IEEE Symposium on Security and Privacy, pages 44-54, Oakland, California, May 4-7, 1997.

[25] Paul F. Syverson, Gene Tsudik, Michael G. Reed, and Carl E. Landwehr. Towards an analysis of onion routing security. In Hannes Federrath, editor, Designing Privacy Enhancing Technologies: Design Issues in Anonymity and Unobservability, pages 96-114. Springer-Verlag, LNCS 2009, July 2000.

[26] B. R. Venkatraman and R. E. Newman-Wolfe. Transmission schedules to prevent traffic analysis. In Proc.
IEEE/ACM Ninth Annual Computer Security Applications Conference, pages 108-115, Orlando, FL, December 6-10, 1993. IEEE CS Press.

[27] B. R. Venkatraman and R. E. Newman-Wolfe. Performance analysis of a method for high level prevention of traffic analysis using measurements from a campus network. In Proc. IEEE/ACM Tenth Annual Computer Security Applications Conference, pages 288-297, Orlando, FL, December 5-9, 1994. IEEE CS Press.

BIOGRAPHICAL SKETCH

Vipan Reddy Nalla was born on August 1, 1981, in Nizamabad, Andhra Pradesh, India. He received his undergraduate degree, a Bachelor of Technology in civil engineering, from the Indian Institute of Technology, Chennai (Madras), India, in August 2001. He joined the University of Florida in Spring 2003 to pursue his master's degree. His research interests include network security and cryptography, with an emphasis on anonymity and covert channels.