On Thursday, we met to write some more of our implementation code. We debated how to best protect against a bystander seeing multiple tweets to the same identifier. We came up with the following solutions:
- Use multiple hash routines: MD5, SHA-1, SHA-256, etc.
  - Now a group name has X different identifiers
  - Messages will still be sent to the same identifier, since X+1 messages can be sent
- MAC with N different seeds known to everyone
  - Same problem as multiple hash routines
- MAC with the next seed in the cipher-text message
  - Guaranteed to be difficult to find multiple messages to the same group
  - What if Twitter doesn’t post one of the tweets?
- MAC with the epoch as the seed
  - That way, only messages within the same epoch will have the same identifier
  - Now you can only search for messages within an epoch
  - Twitter fiddling with timestamps isn’t a real problem
  - Defends against DDoS
- MAC with the seed in the plaintext of the tweet
  - Twitter can muck with this
  - Search now has to compute a MAC for every tweet
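To make the epoch option concrete, here is a minimal sketch in Python. It assumes the group name is the MAC key, that an epoch is one day long, and that HMAC-SHA256 truncated to 8 hex digits is the identifier; none of these choices is fixed by the notes above, and the function names are hypothetical.

```python
import hashlib
import hmac

EPOCH_SECONDS = 24 * 60 * 60  # assumption: one epoch lasts one day


def epoch_identifier(group_name: str, timestamp: float, digits: int = 8) -> str:
    """Return a short identifier that is stable within an epoch and changes between epochs."""
    epoch = int(timestamp // EPOCH_SECONDS)
    mac = hmac.new(
        group_name.encode("utf-8"),    # assumption: the group name acts as the MAC key
        str(epoch).encode("ascii"),    # the epoch number is the seed
        hashlib.sha256,
    )
    return mac.hexdigest()[:digits]


def candidate_identifiers(group_name: str, timestamp: float, window: int = 1) -> set:
    """Identifiers for the current epoch plus `window` epochs on either side.

    Searching a small window tolerates clock differences near an epoch boundary.
    """
    return {
        epoch_identifier(group_name, timestamp + offset * EPOCH_SECONDS)
        for offset in range(-window, window + 1)
    }
```

A reader would compute `candidate_identifiers(group, now)` once and search for those few strings, so the search cost stays constant per epoch instead of growing with the number of tweets.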
On Wednesday, today, we met and created our presentation. We started by writing up an outline of all the topics we wanted to cover: Project Goals, Threat Model, Security Considerations, Identification, Privacy, Integrity, Encryption Scheme, 140 Characters, and Implementation. We then created a Google Docs presentation. You can view the PDF for the presentation here.
We met on Thursday and made fantastic progress. We settled on our first-attempt crypto technique, which depends on a shared secret and uses AES encryption. The shared secret will be the private group name, and we will use its hash as the identifier. We feel this is safer now that we will publish only the first 8 or so digits of the hash, so an attacker gains little extra information.
We both like this method better than having an offline shared secret. Since the name of the group is also the password, you only have to know the group name to read and write private messages to it.
We have implemented a Python script that takes a private group name and a plain-text message and generates an encrypted message with an identifier attached, ready to be posted to Twitter. We also have a decrypt version of the script that takes an encrypted message from Twitter and decrypts it with a given group name.
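The identifier and framing steps of such a script might look roughly like the sketch below. This is not our actual script: SHA-256, the `key:` prefix used to keep the key separate from the published identifier, the base64 framing, and all function names are assumptions made for illustration. The AES step itself is left out because it needs a third-party library.

```python
import base64
import hashlib


def group_identifier(group_name: str, digits: int = 8) -> str:
    """First few hex digits of the hash of the group name, published with each tweet."""
    return hashlib.sha256(group_name.encode("utf-8")).hexdigest()[:digits]


def group_key(group_name: str) -> bytes:
    """Derive a 32-byte key from the group name (assumed derivation).

    The "key:" prefix means the published identifier is not a prefix of the key.
    """
    return hashlib.sha256(b"key:" + group_name.encode("utf-8")).digest()


def format_tweet(group_name: str, ciphertext: bytes) -> str:
    """Attach the identifier to already-encrypted bytes."""
    return group_identifier(group_name) + " " + base64.b64encode(ciphertext).decode("ascii")


def parse_tweet(group_name: str, tweet: str):
    """Return the ciphertext bytes if the tweet carries this group's identifier, else None."""
    identifier, _, body = tweet.partition(" ")
    if identifier != group_identifier(group_name):
        return None
    return base64.b64decode(body)
```

The bytes returned by `parse_tweet` would then be handed to the AES decryption routine together with `group_key(group_name)`.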
For this milestone, we narrowed our goal from encrypting messages to multiple people via @replies, to instead addressing #hash tags. That is, a message will now be encrypted when a user types something like:
Hey guys. Good luck on your milestone paper! #!!comp527-dudes
#hash tags are used frequently on Twitter to post messages to groups of people. We will recognize when #!! is used in front of a tag. This denotes that the message is intended for the comp527-dudes group and should be encrypted so that only they can read it.
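Recognizing the #!! prefix can be sketched with a regular expression. The set of characters allowed in a group name (letters, digits, hyphen, underscore) is an assumption, as is the function name.

```python
import re

# Assumption: group names consist of letters, digits, hyphens and underscores.
SECURE_TAG = re.compile(r"#!!([A-Za-z0-9_-]+)")


def extract_private_group(tweet: str):
    """Return (group_name, message_without_tag), or None if no #!! tag is present."""
    match = SECURE_TAG.search(tweet)
    if match is None:
        return None
    group = match.group(1)
    # Remove the tag itself so the group name never appears in the plain text.
    message = (tweet[:match.start()] + tweet[match.end():]).strip()
    return group, message
```

A tweet without the exact #!! prefix returns `None` and would be posted unencrypted, so a client would probably want to warn the user before sending anything that merely resembles a private tag.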
The service now has two goals: encryption and tag identification. Tag identification allows followers of the comp527-dudes group to be notified when they have received an encrypted message. Encryption will actually encrypt the given tweet so only group members can read it.
We assume that encrypted hash group names will remain private. That is, no one outside of the group will know that the group’s tag is comp527-dudes. Essentially the hash tag acts as a shared password that everyone privately agrees upon as the identifier for the group. After our system processes a tweet, the following will actually be tweeted:
plain-text = msg, time, author
nonce, hash<key>, enc<plain-text, key * nonce>
The message that will be sent out to the Twitter public timeline begins with a random nonce to prevent replay attacks. Next is a hash of the key. We will discuss where the key comes from later. For now, assume that the key is a shared secret that only the members of comp527-dudes know. Because we know the key, we also know its hash, so we can identify whether an encrypted public tweet was intended for the comp527-dudes group by searching the timeline for that string.
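The search step can be sketched as follows. The space-separated layout, the 4-byte nonce, SHA-256, and the 8-digit truncation of the key hash are assumptions for illustration; the ciphertext is treated as an opaque, already-encoded string.

```python
import hashlib
import secrets


def key_hash(key: bytes, digits: int = 8) -> str:
    """Short public fingerprint of the shared key."""
    return hashlib.sha256(key).hexdigest()[:digits]


def build_tweet(key: bytes, ciphertext_b64: str) -> str:
    """Lay out the tweet as: nonce, hash<key>, ciphertext."""
    nonce = secrets.token_hex(4)  # assumption: 4 random bytes, hex encoded
    return f"{nonce} {key_hash(key)} {ciphertext_b64}"


def tweets_for_group(key: bytes, timeline):
    """Return the tweets in `timeline` whose second field matches our key hash."""
    wanted = key_hash(key)
    matches = []
    for tweet in timeline:
        fields = tweet.split(" ")
        if len(fields) == 3 and fields[1] == wanted:
            matches.append(tweet)
    return matches
```

Because the key hash is the same in every tweet for a group, a bystander can still link those tweets to each other, even without knowing which group they belong to.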
Then, we have the cipher text, which is encrypted with the key multiplied by the nonce. The plain-text of the encryption is simply the message, the date and time it was posted, and the author. We include these details in the message both to further prevent replay and to prevent system tampering. We cannot trust Twitter. Twitter cannot read the contents of the message, but it can alter when messages are posted or even who posts a message. By encrypting these details, we allow the recipient to verify the authenticity of a message. In the next milestone, we would like to replace the author with digital signatures to prove authorship.
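The recipient-side check could look something like this sketch, which compares the fields recovered from the decrypted plain-text with what Twitter reports for the tweet. The JSON encoding, the five-minute tolerance, and the function names are assumptions; a real encoding would need to be more compact to fit the character limit.

```python
import json

MAX_CLOCK_SKEW = 300  # assumption: accept up to five minutes of difference


def pack_plaintext(msg: str, posted_at: int, author: str) -> bytes:
    """Bundle message, time, and author into the bytes that get encrypted."""
    fields = {"msg": msg, "time": posted_at, "author": author}
    return json.dumps(fields, separators=(",", ":")).encode("utf-8")


def verify_plaintext(plaintext: bytes, reported_time: int, reported_author: str) -> str:
    """Return the message if the decrypted fields agree with Twitter's metadata."""
    fields = json.loads(plaintext.decode("utf-8"))
    if fields["author"] != reported_author:
        raise ValueError("author does not match the account that posted the tweet")
    if abs(fields["time"] - reported_time) > MAX_CLOCK_SKEW:
        raise ValueError("timestamp differs too much from the posting time")
    return fields["msg"]
```

A replayed tweet would carry its original timestamp inside the cipher text, so it fails the time check when it is posted again later.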
We have debated many choices for how to pick what the key could be. One approach is to use the group name as the key. Previously, we did assume that only comp527-dudes members knew the name of the group, but this approach is fragile. For instance, a group member might accidentally tweet:
Hey guys. You rock! #~~comp527-dudes
Since #!! did not appear, this message will not be encrypted and everyone will know the group and the key.
For this milestone, the key is an offline shared secret that every member of the group knows. There is an association between the group name and the shared key, but the key is not derivable from the name, so if the group name is discovered, the key stays secret and the messages remain unreadable.
In the next milestone, we will shift to the use of asymmetric keys. Right now, if the key is cracked, an attacker can both decrypt messages and encrypt new ones. Unfortunately, the easy asymmetric solution requires sharing the private key, and that is also a vulnerability. It may be unavoidable, however. Furthermore, our encryption currently makes no guarantee as to the size of the message. Ideally, we would like to break a message up into segments so that each fits in the character limit. Some space is going to be taken up by the nonce, the tag, the time, and the author. We also need to consider how to detect chunks of the message being removed or duplicated, and how to prevent an attacker from stopping us from reading the message at all.
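One possible way to do the segment bookkeeping is to give each chunk an "index/total" header, as in the sketch below. The header format and the 100-character payload budget (leaving room for the nonce, tag, and other fields) are assumptions. For the detection to resist tampering, the header would have to travel inside the encrypted plain-text rather than next to it.

```python
def split_message(message: str, chunk_size: int = 100):
    """Split a message into chunks labelled 'index/total:payload'."""
    parts = [message[i:i + chunk_size] for i in range(0, len(message), chunk_size)] or [""]
    total = len(parts)
    return [f"{index}/{total}:{part}" for index, part in enumerate(parts, start=1)]


def reassemble(chunks):
    """Rebuild the message, raising ValueError on duplicate, missing, or inconsistent chunks."""
    seen = {}
    expected_total = None
    for chunk in chunks:
        header, _, part = chunk.partition(":")
        index_text, _, total_text = header.partition("/")
        index, total = int(index_text), int(total_text)
        if expected_total is None:
            expected_total = total
        elif total != expected_total:
            raise ValueError("chunks disagree about the total count")
        if not 1 <= index <= total:
            raise ValueError(f"chunk index {index} is out of range")
        if index in seen:
            raise ValueError(f"duplicate chunk {index}")
        seen[index] = part
    if expected_total is None:
        raise ValueError("no chunks received")
    missing = [i for i in range(1, expected_total + 1) if i not in seen]
    if missing:
        raise ValueError(f"missing chunks {missing}")
    return "".join(seen[i] for i in range(1, expected_total + 1))
```

Chunks may arrive in any order, since the index in the header determines their position in the reassembled message.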
Since one of us is out interviewing this week, there was not much progress made in terms of realizing our proposal. However, I downloaded and installed GPG and started familiarizing myself with the command line tool. I also downloaded some GUI applications that will make quick testing easy. Finally, I found a Python library that will allow us to search Twitter for tweets, something I expect will come in handy once we get this project moving. I’m working on generating some documentation so that when my teammate comes back, he can quickly pick up what I’ve discovered and we can jump straight into prototyping. Our first milestone will be to tag a message in order to make it searchable by only the desired party.
Twitter provides a secure channel to send messages from one user to another via DMs (Direct Messages). However, if a user wants to send a message privately to multiple Twitter contacts, he must send the identical message once to each contact. We want to create a method for a user to publicly transmit a message once to multiple recipients but still maintain privacy such that only the recipients may read the message. We will use the public tweet mechanism to post a message from a given user, restricted to 140 characters, which will be cipher-text that only the desired recipients can decrypt. We believe we can ensure the privacy of both the contents of the message and the list of its recipients.
As tweets are broadcast by millions of users, we can think of any particular encrypted message as the metaphorical needle in the haystack. In essence, then, we are attempting to create a system wherein the needle responds to a specified set of magnets that the intended recipients can wield to find it, while the rest are left to manually search through the whole stack blind. While we cannot prevent an adversary from learning that a given user has sent an encrypted message (since we are using the public tweet mechanism), the ability to broadcast a message through public channels that only the intended recipients can search for and decrypt is valuable in and of itself as communication systems, especially social networks, become more and more public. Although many such systems provide means by which to be “private” about group messages, sometimes that option is not available, and at other times, we would like to prevent the very facilitator of the communication, or another supervisory authority, from being able to read these messages. Furthermore, our system would allow any user to broadcast a message addressed to someone not connected to them explicitly (i.e., as a Twitter friend), thereby providing another level of anonymity.
We consider the following when approaching this problem. First, Twitter strictly limits messages to 140 characters. Therefore, we have a strict limit on the size of our cipher-text message. This presents an atypical problem for cryptographic systems, which rarely have such a strict character limit. Preliminarily, we have decided on a few solutions to deal with this character limit:
We must also consider that a message must not only be encrypted but also somehow tagged with its recipients. It makes no sense to encrypt a tweet and publicly broadcast it without somehow informing recipients that it has been sent out. The naive option would be for a user to scan all public tweets and attempt to decrypt them. Messages that successfully decrypt must then have been sent to this user. This requires far too much computation. Instead, we would like to tag each encrypted message with its recipients. The Twitter way to do this would be like:
@john @sawyer @ben <x781nnd4815162342ciphertextA2J9I3R0A316>
This method, however, informs the whole world that the user is sending an encrypted message to John, Sawyer, and Ben. This is obviously bad. We would like to intermix identifiers in the cipher-text that only John, Sawyer, and Ben would realize.
<1nsf93[John Identifier]sf9323n[Sawyer Identifier]nfkad8fu023[Ben Identifier]sfnm13l
This way, no one has any idea who this message is sent to. But what if John noticed Sawyer’s identifier? Then for any message that contained Sawyer’s identifier, John would know it was sent to Sawyer. Therefore, we need to create an identification protocol that changes with each message sent. That way, even if John figured out Sawyer’s identifier, it would change for the next message, so nothing is learned.
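One way such a changing identifier could work is to compute a MAC over a fresh per-message nonce, keyed with a secret that the sender shares with each recipient. In the sketch below, the per-recipient secrets, HMAC-SHA256, the 8-byte nonce, the 8-digit truncation, and the function names are all assumptions; how the secrets get established is outside the sketch.

```python
import hashlib
import hmac
import secrets


def recipient_tag(recipient_secret: bytes, nonce: bytes, digits: int = 8) -> str:
    """Identifier for one recipient, valid only for the message carrying this nonce."""
    return hmac.new(recipient_secret, nonce, hashlib.sha256).hexdigest()[:digits]


def tag_message(recipient_secrets, nonce=None):
    """Return (nonce, tags): a fresh nonce and one tag per intended recipient."""
    if nonce is None:
        nonce = secrets.token_bytes(8)
    tags = [recipient_tag(secret, nonce) for secret in recipient_secrets]
    return nonce, tags


def is_addressed_to(my_secret: bytes, nonce: bytes, tags) -> bool:
    """A recipient recomputes their own tag from the public nonce and looks for it."""
    mine = recipient_tag(my_secret, nonce)
    return any(hmac.compare_digest(mine, tag) for tag in tags)
```

Since the nonce changes with every message, the tag John might attribute to Sawyer in one tweet does not reappear in the next. The cost is that each recipient must compute one MAC per tweet they scan.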
The idea of encrypting messages for multiple recipients is similar to PGP. We will most likely incorporate PGP technology into our work. The incredibly naive way to solve this problem would be for every pair of Twitter users to agree on a secret key. This requires n(n-1)/2 keys for n users, which grows quadratically. With millions of Twitter users, we must find a better way to deal with keys. Some ideas that might be useful:
The second idea, if possible to implement, would be fantastic because it would require no modification to the Twitter service. However, it suffers from the problem that if I as @JamesCasey could get my private key from my user name, anyone else could too. There would need to be some other secret that only I as @JamesCasey would know, like my Twitter password, that could be used to create the private key. This would require each Twitter user to authenticate through our service, so it will likely not be used, but it is something interesting to explore.
Finally, we could take advantage of the idea of friends on Twitter to only search through your friends/followers to look for potential encrypted messages, but ideally we would like anyone on Twitter to be able to send anyone else an encrypted message regardless of whether they are connected. This could make the problem much harder.
In the previous sections we have listed many different alternatives to solving this problem. We hope to explore each of these. If we are able to successfully solve the problem and explore these possibilities, we would also like to do the following: