IDENTITY, PRIVACY, and ANONYMITY on the INTERNET
================================================

(c) Copyright 1993 L. Detweiler. Not for commercial use except by permission from author, otherwise may be freely copied. Not to be altered. Please credit if quoted.

SUMMARY
=======

Information on email and account privacy, anonymous mailing and posting, encryption, and other privacy and rights issues associated with use of the Internet and global networks in general.

(Search for <#.#> for exact section. Search for '_' (underline) for next section.)

PART 1
====== (this file)

Identity
--------

<1.1> What is `identity' on the internet?
<1.2> Why is identity (un)important on the internet?
<1.3> How does my email address (not) identify me and my background?
<1.4> How can I find out more about somebody from their email address?
<1.5> Why is identification (un)stable on the internet?
<1.6> What is the future of identification on the internet?

Privacy
-------

<2.1> What is `privacy' on the internet?
<2.2> Why is privacy (un)important on the internet?
<2.3> How (in)secure are internet networks?
<2.4> How (in)secure is my account?
<2.5> How (in)secure are my files and directories?
<2.6> How (in)secure is X Windows?
<2.7> How (in)secure is my email?
<2.8> How am I (not) liable for my email and postings?
<2.9> How do I provide more/less information to others on my identity?
<2.10> Who is my sysadmin? What does s/he know about me?
<2.11> Why is privacy (un)stable on the internet?
<2.12> What is the future of privacy on the internet?

Anonymity
---------

<3.1> What is `anonymity' on the internet?
<3.2> Why is `anonymity' (un)important on the internet?
<3.3> How can anonymity be protected on the internet?
<3.4> What is `anonymous mail'?
<3.5> What is `anonymous posting'?
<3.6> Why is anonymity (un)stable on the internet?
<3.7> What is the future of anonymity on the internet?

PART 2
====== (next file)

Issues
------

<4.1> What is the Electronic Frontier Foundation (EFF)?
<4.2> Who are Computer Professionals for Social Responsibility (CPSR)?
<4.3> What was `Operation Sundevil' and the Steve Jackson Game case?
<4.4> What is Integrated Services Digital Network (ISDN)?
<4.5> What is the National Research and Education Network (NREN)?
<4.6> What is the FBI's proposed Digital Telephony Act?
<4.7> What is U.S. policy on freedom/restriction of strong encryption?
<4.8> What other U.S. legislation is related to privacy?
<4.9> What are references on rights in cyberspace?
<4.10> What is the Computers and Academic Freedom (CAF) archive?

Clipper
-------

<5.1> What is the Clipper Chip Initiative?
<5.2> How does Clipper blunt `cryptography's dual-edge sword'?
<5.3> Why are technical details of the Clipper chip being kept secret?
<5.4> Who was consulted in the development of the Clipper chip?
<5.5> How is commercial use/export of Clipper chips regulated?
<5.6> What are references on the Clipper Chip?
<5.7> What are compliments/criticisms of the Clipper chip?
<5.8> What are compliments/criticisms of the Clipper Initiative?
<5.9> What are compliments/criticisms of the Clipper announcement?
<5.10> Where does Clipper fit in U.S. cryptographic technology policy?

PART 3
====== (last file)

Resources
---------

<6.1> What UNIX programs are related to privacy?
<6.2> How can I learn about or use cryptography?
<6.3> What is the cypherpunks mailing list?
<6.4> What are some privacy-related newsgroups? FAQs?
<6.5> What is internet Privacy Enhanced Mail (PEM)?
<6.6> What are other Request For Comments (RFCs) related to privacy?
<6.7> How can I run an anonymous remailer?
<6.8> What are references on privacy in email?
<6.9> What are some email, Usenet, and internet use policies?

Miscellaneous
-------------

<7.1> What is ``digital cash''?
<7.2> What is a ``hacker'' or ``cracker''?
<7.3> What is a ``cypherpunk''?
<7.4> What is `steganography' and anonymous pools?
<7.5> What is `security through obscurity'?
<7.6> What are `identity daemons'?
<7.7> What standards are needed to guard electronic privacy?

Footnotes
---------

<8.1> What is the background behind the Internet?
<8.2> How is Internet `anarchy' like the English language?
<8.3> Most Wanted list
<8.4> Change history

* * *

IDENTITY
========

_____
<1.1> What is `identity' on the internet?

Generally, today people's `identity' on the internet is primarily determined by their email address in the sense that this is their most unchanging 'face' in the electronic realm. This is your login name qualified by the complete address domain information, for example ``ld231782@longs.lance.colostate.edu''. People see this address when receiving mail or reading USENET posts from you and in other situations where programs record usage. Some obsolete forms of addresses (such as BITNET) still persist.

In email messages, additional information on the path that a message takes is prepended to the message received by the recipient. This information identifies the chain of hosts involved in the transmission and is a very accurate trace of its origination. This type of identify-and-forward protocol is also used in the USENET protocol to a lesser extent. Forging these fields requires corrupted mailing software at sites involved in the forwarding and is very uncommon. Not so uncommon is forging the chain at the origination point, so that all initial sites in the list are faked at the time the message is created. Tracing these messages can be difficult or impossible when the initial faked fields are names of real machines and represent real transfer routes.

_____
<1.2> Why is identity (un)important on the internet?

The concept of identity is closely intertwined with communication, privacy, and security, which in turn are all critical aspects of computer networks. For example, the convenience of communication afforded by email would be impossible without conventions for identification.
But there are many potential abuses of identity possible that can have very severe consequences, with massive computer networks at the forefront of the issue, which can potentially either exacerbate or solve these problems.

Verifying that an identity is correct is called `authentication', and one classic example of the problems associated with it is H.G. Wells' ``War of the Worlds'' science fiction story adapted to a radio broadcast that fooled segments of the population into thinking that an alien invasion was in progress. Hoaxes of this order are not uncommon on Usenet and forged identities make them more insidious. People and their reputations can be assaulted by forgery.

However, the fluidity of identity on the internet is for some one of its most attractive features. Identity is just as useful as it is harmful. A professor might carefully explain a topic until he finds he is talking to an undergraduate. A person of a particular occupation may be able to converse with others who might normally shun him. Some prejudices are erased, but, on the other hand, many prejudices are useful! A scientist might argue he can better evaluate the findings of a paper as a reviewer if he knows more about the authors. Likewise, he may be more likely to reject it based on unfair or irrelevant criteria. On the other side of the connection, the author may find identities of reviewers useful in exerting pressure for acceptance.

Identity is especially crucial in establishing and regulating `credit' (not necessarily financial) and `ownership' and `usage'. Many functions in society demand reliable and accurate techniques for identification. Heavy reliance will be placed on digital authentication as global economies become increasingly electronic. Many government functions and services are based on identification, and law enforcement frequently hinges on it. Hence, employees of many government organizations push toward stronger identification structures.
But when does identification invade privacy?

The growth of the internet is provoking social forces of massive proportions. Decisions made now on issues of identity will affect many future users, especially as the network becomes increasingly global, universal, widespread, and entrenched; and the positive or adverse effects of these actions, intended and inadvertent, will be magnified exponentially.

_____
<1.3> How does my email address (not) identify me and my background?

Your email address may contain information that influences people's perceptions of your background. The address may `identify' you as from a department at a particular university, an employee at a company, or a government worker. It may contain your last name, initials, or cryptic identification codes independent of both. In the US some are based on parts of social security numbers. Others are in the form 'u2338' where the number is incremented in the order that new users are added to the system.

Standard internet addresses also can contain information on your broad geographical location or nationhood. However, none of this information is guaranteed to be correct or be there at all. The fields in the domain qualification of the username are based on rather arbitrary organization, such as (mostly invisible) network cabling distributions. The only point to make is that early fields in the address are more specific (such as specific computer names or local networks) and the later ones the most general (such as continental domains). Typically the first field is the name of the computer receiving mail.

Gleaning information from the email address alone is sometimes an inspired art or an inconsistent and futile exercise. (For more information, see the FAQs on email addresses and known geographical distributions below.) However, UNIX utilities exist to aid in the quest (see the question on this).
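
The ordering of fields described above, from most specific to most general, can be made concrete with a short sketch. This is only an illustration written in Python, not a utility referred to in this FAQ. The function name `split_address' is hypothetical, and the split says nothing reliable about the person behind the address.

```python
def split_address(address):
    """Split 'login@host.sub.domain' into the login name and its domain fields."""
    login, sep, domain = address.partition("@")
    if not sep or not login or not domain:
        raise ValueError("expected an address of the form login@domain")
    # Fields run from most specific (typically the computer receiving mail)
    # to most general (the final suffix).
    fields = domain.split(".")
    return login, fields


login, fields = split_address("ld231782@longs.lance.colostate.edu")
print(login)   # ld231782
print(fields)  # ['longs', 'lance', 'colostate', 'edu']
```

The last element of the field list is the suffix, and the first is usually the machine that receives the mail. None of the fields is guaranteed to be meaningful.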

Common Suffixes
---------------

.us     United States
.uk     United Kingdom
.ca     Canada
.fi     Finland
.au     Australia

.edu    university or college
.com    commercial organization
.org    'other' (e.g. nonprofit organization)
.gov    government
.mil    military site

_____
<1.4> How can I find out more about somebody with a given email address?

One simple way is to send email to that address, asking. Another way is to send mail to the postmaster at that address (i.e. postmaster@address), although the postmaster's job is more to help find user IDs of particular people given their real name and solve mail routing problems. The sysadmin (i.e. `root@address') may also be able to supply information. Users with related email addresses may have information. However, all of these methods rely on the time and patience of others so use them minimally.

One of the most basic tools for determining identity over the internet is the UNIX utility 'finger'. The basic syntax is:

    finger user@here.there.everywhere

This utility uses communication protocols to query the computer named in the address for information on the user named. The response is generated completely by the receiving computer and may be in any format. Possible responses are as follows:

- A message `unknown host' meaning some aspect of the address is incorrect, two lines with no information and '???'.
- A message 'In real life: ???' in which case the receiving computer could not find any kind of a match on the username. The finger utility may return this response in other situations.
- A listing of information associated with multiple users. Some computers will search only for matching user IDs, others will attempt to find the username you specified as a substring of all actual full names of users kept in a local database.

At some sites 'finger' can be used to get a list of all users on the system with a `finger @address'. In general this is often considered weak security, however, because `attackers' know valid user IDs to `crack' passwords.
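
The exchange behind `finger' is simple enough to sketch. The following Python is a minimal illustration, assuming the remote host still runs a finger service on TCP port 79 as described in RFC 1288; many sites do not, so the connection may be refused. The function names are hypothetical and this is not the source of the UNIX utility itself.

```python
import socket

FINGER_PORT = 79  # TCP port assigned to the finger service (RFC 1288)


def parse_target(target):
    """Split 'user@host' into (user, host); '@host' yields an empty user."""
    user, sep, host = target.rpartition("@")
    if not sep or not host:
        raise ValueError("expected user@host or @host")
    return user, host


def build_query(user):
    """A finger query is the user name (possibly empty) followed by CRLF."""
    return user.encode("ascii") + b"\r\n"


def finger(target, timeout=10.0):
    """Send the query and return whatever text the remote host chooses to send."""
    user, host = parse_target(target)
    with socket.create_connection((host, FINGER_PORT), timeout=timeout) as conn:
        conn.sendall(build_query(user))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # remote host closed the connection
                break
            chunks.append(data)
    # The reply format is entirely up to the remote host, so decode leniently.
    return b"".join(chunks).decode("latin-1")
```

Calling `finger("user@here.there.everywhere")' would return the reply as a string. An empty user name, as in `finger("@address")', asks the host for its list of users, which the host is free to refuse.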

More information on the fields returned by `finger' is given below. More information on `finger' and locating people's email addresses is given in the email FAQ (such as the WHOIS lookup utility). Just as you can use these means to find out about others, they can use them to find out about you. You can `finger' yourself to find out what is publicly reported by your UNIX system about you. Be careful when modifying `finger' data; virtually anyone with internet access worldwide can query this information. In one famous case, the New York Times writer J. Markoff uncovered the identity of R. Morris, author of the Internet Worm, through the use of `finger' after an anonymous caller slipped by revealing his initials which were also his login ID. See the book Cyberpunk by K. Hafner and J. Markoff.

_____
<1.5> Why is identification (un)stable on the internet?

Generally, identity is an amorphous and almost nonexistent concept on the Internet for a variety of reasons. One is the inherent fluidity of `cyberspace' where people emerge and submerge frequently, and absences are not readily noted in the `community'. Most people remember faces and voices, the primary means of casual identification in the 'real world'. The arbitrary and cryptic sequences of letters and digits comprising most email addresses are not particularly noticeable or memorable and far from a unique identification of an individual, who may use multiple accounts on multiple machines anywhere in the world.

Currently internet users do not really have any great assurances that the messages in email and USENET are from whom they appear to be. A person's mailing address is far from an identification of an individual.

- Anyone with access to the account, e.g. they know the password, either legitimately or otherwise, can send mail with that address in the From: line.
- Email addresses for an individual tend to change frequently as they switch jobs or make moves inside their organizations.
- As part of current mailing protocol standards, forging the From: line in mail messages is a fairly trivial operation for many hackers. The status and path information prepended to messages by intermediate hosts is generally unforgeable. In general, while possible, forgeries are fairly rare on most newsgroups and in email.

Besides these pathological cases above there are many basic problems with today's internet protocols affecting identification on the internet:

- Internet mail standards, described in RFC-822, are still evolving rapidly and not entirely orderly. For example, standards for mail address `munging' or `parsing' tend to vary slightly between sites and frequently mean the difference between finding addresses and bouncing mail.
- Domain names and computer names are frequently changed at sites, and there are delays in the propagation of this data.
- Addresses cannot be resolved when certain critical computers crash, such as the receiving computer or other computers involved in resolving names into addresses called `nameservers'.
- A whole slew of problems is associated with `nameservers'; if they are not updated they will not find name addresses, and even the operation of what constitutes `updating' has different interpretations at different sites.

The current internet mailing and addressing protocols are slightly anachronistic in that they were created when the network was somewhat obscure and not widespread, with only a fraction of the traffic it now sees. Today a large proportion of internet traffic is email, comprising millions of messages.

_____
<1.6> What is the future of identification on the internet?

Some new technologies and standards are introducing facial images and voice messages into mail and these will improve the sense of community that comes from the familiarity of identification. However, they are not currently widespread, require large amounts of data transfer, standardized software, and make some compromises in privacy.

Promising new cryptographic techniques may make 'digital signatures' and 'digital authentication' common (see below). Also, the trend in USENET standards is toward greater authentication of posted information. On the other hand, advances in ensuring anonymity (such as remailers) are forthcoming. See below.

PRIVACY
=======

_____
<2.1> What is `privacy' on the internet?

Generally, while `privacy' has multiple connotations in society and perhaps even more on the internet, in cyberspace most take it to mean that you have exclusive use and access to your account and the data stored on and directed to it (such as email), and you do not encounter arbitrary restrictions or searches. In other words, others may obtain data associated with your account, but not without your permission. These ideas are probably both fairly limiting and liberal in their scope in what most internet users consider their private domains. Some users don't expect or want any privacy, some expect and demand it.

_____
<2.2> Why is privacy (un)important on the internet?

This is a somewhat debatable and inflammatory topic, arousing passionate opinions. On the internet, some take privacy for granted and are rudely surprised to find it tenuous or nonexistent. Most governments have rules that protect privacy (such as the illegal search and seizure clause of the U.S. constitution, adopted by others) but have many that are antithetical to it (such as laws prohibiting secret communications or allowing wiretapping). These rules generally carry over to the internet with few specific rules governing it. However, the legal repercussions of the global internet are still largely unknown and untested (i.e. no strong legal precedents and court cases). The fact that internet traffic frequently crosses international boundaries, and is not centrally managed, significantly complicates and strongly discourages its regulation.

_____
<2.3> How (in)secure are internet networks?

- `Theoretically' people at any site in the chain of sites with access to hardware and network media that transmits data over the Internet could potentially monitor or archive it. However, the sheer volume and general 'noise' inherent to this data makes these scenarios highly improbable, even by government agencies with supposedly vast funding and resources.
- Technologies exist to `tap' magnetic fields given off by electrical wires without detection. Less obscurely, any machine with a network connection is a potential station for traffic detection, but this scenario requires knowledge of and access to very low-level hardware (the network card) to pursue, if even possible.
- Network General Inc. is one of many companies that manufacture and market sophisticated network monitoring tools that can 'filter' and read packets by arbitrary criteria for troubleshooting purposes, but the cost of this type of device is prohibitive for casual use.

Known instances of the above types of security breaches at a major scale (such as at network hubs) are very rare. The greatest risks tend to emerge locally. Note that all these approaches are almost completely defused with the use of cryptography.

_____
<2.4> How (in)secure is my account?

By default, not very. There are a multitude of factors that may reinforce or compromise aspects of your privacy on the internet. First, your account must be secure from other users. The universal system is to use a password, but if it is `weak' (i.e. easy to guess) this security is significantly diminished. Somewhat surprisingly and frighteningly to some, certain users of the system, particularly the administrator, generally have unlimited access regardless of passwords, and may grant that access to others. This means that they may read any file in your account without detection.

Furthermore, though it is not universally known, most UNIX systems keep fairly extensive accounting records of when and where you logged in, what commands you execute, and when they are executed (in fact, login information is usually public). Most features of this `auditing' or `process accounting' information are enabled by default after the initial installation and the system administrator may customize it to strengthen or weaken it to satisfy performance or privacy aims. This information is frequently consulted for troubleshooting purposes and may otherwise be ignored. This data tracks unsuccessful login attempts and other 'suspicious' activities on the system. A traditional part of the UNIX system that tracks user commands is easily circumvented by the user with the use of symbolic links (described in 'man ln').

UNIX implementations vary widely particularly in tracking features and new sophisticated mechanisms are introduced by companies regularly. Typically system administrators augment the basic UNIX functionality with public-domain programs and locally-developed tools for monitoring, and use them only to isolate `suspicious' activity as it arises (e.g. remote accesses to the 'passwd' file, incorrect login attempts, remote connection attempts, etc.).

Generally, you should expect little privacy on your account for various reasons:

- Potentially, every keystroke you type could be intercepted by someone else.
- System administrators make extensive backups that are completely invisible to users which may record the states of an account over many weeks.
- Erased files can, under many operating systems, be undeleted.
- Most automated services keep logs of use for troubleshooting or otherwise; for example FTP sites usually log the commands and record the domain originations of users, including anonymous ones.
- Some software exacerbates these problems. See the section on ``X Windows (in)security''.

Independent of malevolent administrators are fellow users, a much more commonly harmful threat. There are multiple ways to help ensure that your account will not be accessed by others, and compromises can often be traced to failures in these guidelines:

- Choose a secure password. Change it periodically.
- Make sure to logout always.
- Do not leave a machine unattended for long.
- Make sure no one watches you when you type your password.
- Avoid password references in email.
- Be conservative in the use of the .rhosts file.
- Use utilities like `xlock' to protect a station, but be considerate.

Be wary of situations where you think you should supply your password. There are only a few basic situations where UNIX prompts you for a password: when you are logging in to a system or changing your password. Situations can arise in which prompts for passwords are forged by other users, especially in cases where you are talking to them (such as Internet Relay Chat). Also, be aware that forged login screens are one method to illegitimately obtain passwords.

(Thanks to Jim Mattson |