Dr. Reddy's Pediatric Office on the Web™


(Medical) Web Sites and Privacy


Especially with the advent of E-commerce, privacy has become an important issue for everyone using the Internet, whether or not they know it. That so many users don't know it is itself one of the biggest problems in Internet privacy.

This page is not meant to be a detailed explanation of Internet security and privacy issues. However, I will talk about some of the technical side of the Internet -- I have to, in order to explain how you can help protect yourself from invasions of privacy. I can't describe all of the possible threats, but I will talk about a couple of the common ones. I have also described what my Web server keeps track of -- which isn't much at all, and certainly not enough to identify you as a visitor.

The Minimum Information a Web Surfer Gives Out

To borrow a term from its designers, the World-Wide Web is stateless.

This has nothing to do with the international reach of the Internet -- it just means that Web servers do not normally remember browsers from one request to the next. As a matter of fact, a server will not necessarily remember your browser even while it is delivering a single page: even a page as simple as this one requires several requests to the server, one for the HTML itself and one for each image on the page. Web servers simply don't have enough storage space or computing capacity to remember every request in a useful manner.

There is a record kept of every access to a Web site. Every Web server keeps an access log and an error log for maintenance and debugging. However, the server log contains very little that could be used to identify the user who accessed the site. A typical access log entry looks like this:

123.456.789.012 - - [00/Jun/2000:24:00:00 -0400] www.drreddy.com "GET /shots/fifth.html HTTP/1.0" 200 0000 "http://www.google.com/search?q=fifth+disease&meta=lr%3D%26hl%3Den&btnG=Google+Search" "Mozilla/4.06 [en] (Win98; I)"

(This is a real entry from my server log -- except that I have changed enough information that the person whose access is recorded cannot be identified.) The most important pieces of information here are:

123.456.789.012
the user's Internet Protocol, or IP address, which identifies the computer used to access the page. This does not identify the user directly, unless the user has permanently registered that IP address. Most Internet users get a different IP address from their Internet Service Provider (ISP) every time they go on the Internet. Your ISP probably has records of when you log on and log off, and what IP address you were assigned for that session, but I wouldn't want to try and get that information without a very good reason -- or without a subpoena, at least in the United States. (The example, by the way, is not a real IP address, as most Internet/Web gurus will tell you instantly.)
[00/Jun/2000:24:00:00 -0400]
the date and time of the access; this is normally in the server's local time, and is followed by the difference between that local time and Greenwich Mean Time (the Internet standard).
www.drreddy.com
the server to which the request was addressed. Many Web sites are hosted by ISPs who put several sites on a single server computer. The server name is here so that the system administrators and Webmasters can find the entries for particular sites.
"GET /shots/fifth.html HTTP/1.0"
the file requested by the user, and the Internet protocol used to access that file. For almost all Web documents the protocol is HTTP.
"http://www.google.com/search?q=fifth+disease&meta=lr%3D%26hl%3Den&btnG=Google+Search"
the URL of the referring page -- where the user found a link to the page being fetched from this server. This is how Webmasters figure out how people are finding out about their site. The referring URLs could be used to track a user's course through the Internet, but only if the "sleuth" had access to every server on the trail -- and again, at least in the United States, that would likely require a very good reason and a (large collection of) subpoena(s).
"Mozilla/4.06 [en] (Win98; I)"
the browser being used to access the page. Mozilla is the "internal" name for Netscape browsers, although Microsoft Internet Explorer's browser ID usually starts with "Mozilla" as well (followed by "compatible" and the letters "IE").

Most of this information cannot be blocked: the server needs your IP address and the request itself in order to answer, and nearly every browser sends the referring URL and browser ID automatically. However, unless you can access the user's ISP's logs (which generally can't be done without legal action) you cannot identify the user by name. Therefore, you as a user have at least some anonymity.
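
To make the pieces of the log entry concrete, here is a minimal Python sketch that splits the sample line above into its fields. The pattern assumes exactly the layout shown in the example (other servers may be configured to log different fields in a different order), and the field names are labels I have chosen for illustration, not anything defined by the server.

```python
import re

# Pattern for the log layout shown above: the common fields plus a
# virtual-host name, referring URL, and browser ID. The group names are
# labels chosen for this sketch, not part of any server's configuration.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] (?P<host>\S+) '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_entry(line):
    """Return the fields of one access-log line as a dict, or None."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


sample = (
    '123.456.789.012 - - [00/Jun/2000:24:00:00 -0400] www.drreddy.com '
    '"GET /shots/fifth.html HTTP/1.0" 200 0000 '
    '"http://www.google.com/search?q=fifth+disease'
    '&meta=lr%3D%26hl%3Den&btnG=Google+Search" '
    '"Mozilla/4.06 [en] (Win98; I)"'
)

entry = parse_entry(sample)
for field, value in entry.items():
    print(f"{field:9} {value}")
```

Notice that nothing in the parsed fields is a name, an E-mail address, or a street address; the closest thing to an identity is the IP address.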

Asking You for Information -- Passwords

The easiest way for a Webmaster to gather information on users is to limit access to the site to "authorized" users. These users are given user IDs and passwords, which they must then use before the server will send them restricted pages.

When you enter an ID and password in your browser, the browser saves them for as long as it is running. (You can clear IDs and passwords by closing the browser program, then restarting it. Internet Explorer offers the option of storing the ID and password on your computer for use in future sessions.) Being stateless, the server does not remember your ID and password from one access to the next (the ID is recorded in the server log, but not the password) -- instead, the browser sends your ID and password as part of every request to that server.

The privacy risk of ID/password identification depends on what information you have to give the server -- and the Webmaster -- to obtain an ID. Often all you need to give is your E-mail address (but then you need to think about what the Webmaster may do with that information... when I get E-mail addresses from visitors I use them only to reply to their questions, but we've all heard of Webmasters collecting E-mail addresses for spam lists). Some sites "require" a lot more information, and you need to think carefully about what information you're giving out -- especially since the passwords are not encrypted, and so can be captured by anyone smart enough to intercept Internet transmissions.
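
The sketch below shows why an intercepted password is readable. It assumes the site uses HTTP "Basic" authentication, the scheme behind the usual browser ID/password prompt; a site that collects passwords some other way would look different. The ID and password here are made up.

```python
import base64


def basic_auth_header(user_id, password):
    """Build the Authorization header value a browser sends for Basic auth."""
    token = base64.b64encode(f"{user_id}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def read_basic_auth_header(value):
    """Recover the ID and password from a captured header value."""
    scheme, _, token = value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    user_id, _, password = decoded.partition(":")
    return user_id, password


header = basic_auth_header("parent42", "open-sesame")  # made-up credentials
print(header)                          # what travels with every request
print(read_basic_auth_header(header))  # what an eavesdropper can recover
```

The header looks scrambled, but Base64 is only a way of packaging text, not a secret code: no key is needed to reverse it.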

Coded IDs in URLs

A fairly popular way to track users is to generate a new home page for everyone who visits a site. The page looks the same to every visitor, but each page's links to other pages on the site contain a string of letters and numbers unique to that page. If you bookmark one of those coded-URL pages, the server will know you're back every time you use the bookmark, and can use the coded-URL information to track your travels through the site.

This kind of tracking is not easy to block completely. One way to block it is to go to the home page every time you use the site -- and make sure you reload the home page for every set of accesses to get different coded URLs. Of course, this makes using that site much less convenient: you need to balance your desire for privacy against the effort needed to ensure it (but then, that could be said of Internet use these days, too...).
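
Here is a rough Python sketch of how a site might implement coded URLs. Everything in it is hypothetical (the `id` parameter name, the `/kids/fun.html` path, and the in-memory `visits` table); real sites vary widely, and this is only meant to show the idea.

```python
import secrets

visits = {}  # token -> list of pages requested with that token


def home_page_links(paths):
    """Create a fresh token and return the site's links tagged with it."""
    token = secrets.token_hex(8)
    visits[token] = []
    return token, [f"{path}?id={token}" for path in paths]


def record_request(path, token):
    """Log a request that arrived carrying a coded URL."""
    visits.setdefault(token, []).append(path)


token, links = home_page_links(["/shots/fifth.html", "/kids/fun.html"])
record_request("/shots/fifth.html", token)  # visitor follows a link
record_request("/shots/fifth.html", token)  # visitor returns via a bookmark
print(links)
print(visits[token])
```

Because the bookmark carries the same token as the original visit, both requests end up in the same list. Reloading the home page produces a new token, which is why that breaks the trail.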

Cookies

The cookie (or magic cookie in Netscape parlance) is a great way to make Internet use -- and E-commerce -- convenient. It is also one of the most significant threats to the privacy of unwary Web surfers.

A cookie is a (usually coded) string of letters and numbers that a Web server sends to your browser when the browser requests a page (or image, or sound file, or any other file). Your browser stores this string in a "cookie jar" -- a file on your computer (Netscape browsers use the file COOKIE.TXT). After the cookie is set, every time your browser gets a page, image, or anything else from that server the cookie is sent with the request.

There are some restrictions on who can get what out of the cookie jar. Browsers will send cookies only to the server that set them, or to other servers in the same domain. However, that still leaves a lot of latitude as far as cookie-linked identifying information is concerned.
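
The exchange can be sketched with Python's standard `http.cookies` module. The cookie name, value, and `example.com` hosts are made up, and the domain check is a simplified stand-in for the rules a real browser applies.

```python
from http.cookies import SimpleCookie

# A header a server might send with its response (made-up name and value).
set_cookie_header = "visitor=a1b2c3; Domain=.example.com; Path=/"

jar = SimpleCookie()          # stands in for the browser's cookie jar
jar.load(set_cookie_header)


def domain_matches(request_host, cookie_domain):
    """True if request_host is the cookie's domain or a host inside it."""
    cookie_domain = cookie_domain.lstrip(".").lower()
    host = request_host.lower()
    return host == cookie_domain or host.endswith("." + cookie_domain)


def cookie_header_for(host):
    """Build the Cookie header the browser would add to a request to host."""
    pairs = [
        f"{name}={morsel.value}"
        for name, morsel in jar.items()
        if domain_matches(host, morsel["domain"])
    ]
    return "; ".join(pairs)


print(cookie_header_for("www.example.com"))           # visitor=a1b2c3
print(cookie_header_for("ads.example.com"))           # visitor=a1b2c3
print(repr(cookie_header_for("www.other-site.org")))  # ''
```

Any server in the cookie's domain gets the cookie back; a server in a different domain gets nothing.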

Cookies can be very useful. For example, some sites use cookies to store encoded passwords for use in future sessions (one example is the New York Times Web site, which also uses cookies to identify "premium" users with access to paid features). However, they can easily be used to track your Internet use. One now famous example is the Web advertising service doubleclick.net, which sets or reads a cookie every time it displays an ad on a Web page. Since the request to the ad server includes the URL of the page the ad appeared on, the ad server can compile a list of every site you visit that carries their ads. This allows them to tailor their ads to your tastes. It also gives them a lot of information about you, just by looking at the sites you visit.
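
To show how little machinery this kind of tracking needs, here is a small Python sketch of an ad server's bookkeeping. It is not a description of how any real advertising service is built; the cookie ID, the site names, and the `profiles` table are all invented for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

profiles = defaultdict(list)  # cookie ID -> sites where an ad was shown


def serve_ad(cookie_id, referring_url):
    """Record which site asked for an ad on behalf of this cookie ID."""
    site = urlparse(referring_url).netloc
    if site not in profiles[cookie_id]:
        profiles[cookie_id].append(site)


# Three ad requests from the same browser, each carrying the same cookie.
serve_ad("a1b2c3", "http://news.example.com/sports/scores.html")
serve_ad("a1b2c3", "http://health.example.org/conditions/asthma.html")
serve_ad("a1b2c3", "http://news.example.com/weather.html")

print(profiles["a1b2c3"])  # ['news.example.com', 'health.example.org']
```

The ad server never learns your name, but a list like this can say a great deal about the person behind the cookie.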

Tasting Cookies, or How to See What's In the Jar

You can read the COOKIE.TXT file (or its equivalent) with any text editor. Modifying the cookie file directly is risky, since some of the characters in the file are non-printing "control" characters and disturbing the contents may make the entire file unreadable. There are also commercial and shareware programs available that will allow you to read the cookie file and even modify the cookies.
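
If you would rather not open the file in an editor, a short script can list its contents without changing anything. This Python sketch assumes the tab-separated layout used by Netscape-style cookie files (seven fields per line, with the tabs being the "control" characters mentioned above); the sample line is made up, and other browsers store cookies differently.

```python
from datetime import datetime, timezone

# Field order assumed for a Netscape-style cookie file.
FIELDS = ("domain", "subdomains", "path", "secure", "expires", "name", "value")


def parse_cookie_file(text):
    """Return one dict per cookie line; skip comments and malformed lines."""
    cookies = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != len(FIELDS):
            continue
        cookie = dict(zip(FIELDS, parts))
        # Expiry is stored as seconds since 1 January 1970 (GMT).
        cookie["expires"] = datetime.fromtimestamp(
            int(cookie["expires"]), tz=timezone.utc
        )
        cookies.append(cookie)
    return cookies


sample = (
    "# Netscape HTTP Cookie File\n"
    ".example.com\tTRUE\t/\tFALSE\t978307200\tvisitor\ta1b2c3\n"
)

for cookie in parse_cookie_file(sample):
    print(cookie["domain"], cookie["name"], cookie["value"],
          cookie["expires"].date())
```

To try it on a real file, read the file's text with `open(...).read()` and pass it to `parse_cookie_file`; the script only reads, so the cookie jar is left untouched.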

You can also set most advanced browsers (Netscape versions 3.0 and higher, and Internet Explorer from at least version 4.0 on) to warn you when a cookie is being set, and ask you if you want the cookie saved or not. (In Netscape version 3.x, select "Network Preferences" from the Options menu, then select the "Protocols" tab and check the box marked "Show an Alert Before Accepting a Cookie". In Netscape version 4.x, select "Preferences" from the Edit menu, then select Advanced and check the "Warn Me Before Accepting a Cookie" box.) Once you do this, you will get an alert box whenever a cookie is set: the box will show you what the cookie contains, the name of the server that wants to set the cookie, and the servers that can read the cookie if you allow it to be set. It will then ask you if you want the cookie. If you do, click "OK"; if you don't, click "Cancel".

Closing the Cookie Jar

If you are really concerned about privacy, you can also block all cookies from your computer.

Browsers store cookies for a particular session in working memory; typically (although this may change) they do not write the new or changed cookies to a file until you close the browser at the end of a session. If you set the cookie file to be "read-only" (with the ATTRIB command in DOS, or using the File Manager or Explorer in Windows -- I don't know Macs well enough to tell you how to do it there), you will prevent your browser from saving new or changed cookies.
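
The same read-only switch can be set from a script. This Python sketch works on a temporary stand-in file rather than a real cookie file; to use it for real you would pass the path of your own browser's cookie file, which I have not tried to guess here.

```python
import os
import stat
import tempfile


def make_read_only(path):
    """Clear the write permission bits so the file cannot be rewritten."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def is_read_only(path):
    """True if the file's owner no longer has write permission."""
    return not (os.stat(path).st_mode & stat.S_IWUSR)


# Demonstrate on a temporary stand-in rather than a real cookie file.
with tempfile.TemporaryDirectory() as folder:
    cookie_file = os.path.join(folder, "cookies-demo.txt")
    with open(cookie_file, "w") as handle:
        handle.write("# stand-in cookie file\n")
    make_read_only(cookie_file)
    print(is_read_only(cookie_file))
    # Restore write permission so the temporary folder can be cleaned up.
    os.chmod(cookie_file, stat.S_IRUSR | stat.S_IWUSR)
```

On Windows, `os.chmod` only controls the read-only flag, which is the same flag ATTRIB and Explorer set.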

You may want to allow a few cookies, and block the rest. To do this, you can

This will let the browser save all the cookies you want to keep, but prevent it from saving others at all. If you later find a site whose cookies you want to keep, just repeat the process for that site.

What Do We Keep Track Of?

Not much...

I do keep a condensed version of the server log, which has only one entry per user no matter how many pages and images you access in a given session. I keep track of what page each user visits first in a session (that is the best way for me to see what pages are popular); I do not regularly check to see which sites refer users to my site (although I may start to do that just to see where people find out about my site). I do get E-mail addresses from people who write to me with comments and questions -- I need the addresses to reply to mail, but I do not keep the addresses except in my mailbox. I'm sufficiently concerned about privacy that I wouldn't want to collect and store any other information -- and if I were visiting another site, I wouldn't want other information about me captured, either. There really isn't any good reason that I can see to collect identifying information about the visitors to a medical Web site, especially without their permission. (You might want to think about that when you're surfing medical sites -- or any other Web sites...)
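
For the curious, here is one way a log could be condensed to a single entry per visitor per session. This is only a Python sketch of the idea, not the actual procedure used for this site: the 30-minute gap that ends a "session", the use of the IP address to stand for a visitor, and the sample entries are all assumptions made for the example.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed cutoff between sessions


def condense(entries):
    """Keep only the first request of each session for each IP address.

    entries: iterable of (ip, datetime, path) tuples in time order.
    """
    last_seen = {}
    condensed = []
    for ip, when, path in entries:
        previous = last_seen.get(ip)
        if previous is None or when - previous > SESSION_GAP:
            condensed.append((ip, when, path))
        last_seen[ip] = when
    return condensed


log = [
    ("192.0.2.10", datetime(2000, 6, 15, 9, 0), "/shots/fifth.html"),
    ("192.0.2.10", datetime(2000, 6, 15, 9, 1), "/images/logo.gif"),
    ("192.0.2.77", datetime(2000, 6, 15, 9, 5), "/index.html"),
    ("192.0.2.10", datetime(2000, 6, 15, 14, 0), "/index.html"),
]

for ip, when, path in condense(log):
    print(ip, when.isoformat(), path)
```

Four log lines shrink to three: the image request is dropped because it belongs to a session already counted, while the afternoon visit from the same address counts as a new session.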



PLEASE NOTE: As with all of this Web site, I try to give general answers to common questions my patients and their parents ask me in my (real) office. If you have specific questions about your child you must ask your child's regular doctor. No doctor can give completely accurate advice about a particular child without knowing and examining that child. I will be happy to try and answer general questions about children's health, but unless your child is a regular patient of mine I cannot give you specific advice.

We comply with the Health On the Net Foundation
HONcode standard for trustworthy health information.

Copyright © 2000 Vinay N. Reddy, M.D. All rights reserved.
Written 06/15/00; last revised 06/15/00; last reviewed 05/07/07