Can Pseudonymity Really Guarantee Privacy?
by: Josyula R. Rao -and- Pankaj Rohatgi

summarized by Jeff Ellen (ellen@uiuc.edu)

Discussion Points
  • What is the motivation for a solution? For privacy, but whose privacy?
  • What is the motivation for identifying the problem? Would anyone actually do this?
  • A fair enough technique and idea, but not very technical/programming/security oriented.
  • Assumes each person has a single style, which is shaky.  If someone were going to all the trouble of maintaining multiple pseudonyms (which are needed for the matching), wouldn't they consciously avoid stylistic tells like these, just as a matter of common sense?  Additionally, each person subconsciously has at least a formal style and an informal style, with many other flavors and variants.
  • The first few variables capture most of the variance, so why bother with all of the extra work?
  • Experiment possibly flawed.  On the newsgroup they drew from, people were not attempting to be anonymous.  Therefore, they were not taking any precautions whatsoever (consciously or not) to avoid being repetitive, and may in fact have been doing the opposite.
  • Seem to fudge the data slightly.  Perhaps because this is a rough draft, but there are just 2 experiments and then some hand waving: "found that at around 6500 words the results are almost as good as for 10000 words".
  • You need prior reason to suspect that people are posting covertly, because the technique yields a high number of false positives when no such posters are actually there.
  • This could be used to analyze newsgroups to see if your employees are posting there and wasting time.
  • RFCs: aren't you guaranteed to know the author?  This isn't exactly Shakespeare.
  • Possible extension that we suggest (maybe instead of RFCs):  smiley analysis in chat rooms.
  • What about non-native English speakers?  (in reference to the misspellings section)
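
To make the "first few variables" point above concrete, here is a minimal sketch of how one might count the leading variables needed to cover a chosen share of the total variance. The function name components_needed, the 0.85 cutoff, and the variance figures are all made up for illustration; none of them come from the paper.

```python
def components_needed(variances, threshold=0.85):
    """Count the largest-variance variables needed to reach `threshold`
    (a fraction between 0 and 1) of the total variance."""
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")

    # Consider variables from most to least variance.
    ordered = sorted(variances, reverse=True)

    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


# Hypothetical per-variable variances, NOT taken from the paper.
example_variances = [5.0, 3.0, 1.0, 0.5, 0.3, 0.2]
print(components_needed(example_variances, threshold=0.85))
```

With these made-up numbers, three of the six variables already pass the cutoff, which is the kind of result that prompts the question of whether the remaining variables are worth their cost.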
Partial Conclusions
  • We suspect there might be very interested parties in the FBI/NSA.
  • Also possibly useful for industrial espionage.
  • Other than that, probably not much mainstream usage; too much effort for the casual user.
  • Could there have been a way to automate this further?  This paper wasn't too technical, very common sense.
  • Although the conclusions are fairly persuasive, especially given the intuitions behind them, we think it would be very easy to foil this (extremely time-consuming) technique with a little thought.
  • Overall lukewarm reception of paper.