The Seven Rules of Trade Practice for Web Publishing
by djlosch
Every now and then, someone who has no idea how the Internet works comes along and files a lawsuit against one of the backbones of the internet, a lawsuit that would set a horrible precedent. The latest debacle is the woman who sued Archive.org, one of the top mirroring sites. Knowing how to install a basic web publishing program makes her neither an authority on the internet nor an expert web publisher. Apparently, Mrs. Suzanne Shell has no idea how the internet works.

Please note that this is not a statement of the law; this is not a legal opinion; this is a software engineer's analysis of common industry standards and trade practices in the web publishing industry. If you want a legal opinion, hire a qualified attorney. "I read it on the internet" is a horrible defense to any lawsuit.

..:: Copying, Contracts, and Craziness

The problem is that these people pick up a copy of Frontpage (or NetObjects Fusion 7 for Windows in Mrs. Shell's case) and post a page without knowing any of the trade practices of the internet. Any network transmission ALWAYS results in a copy. Right now, you are not viewing the actual djlosch.com site at this page's address. Your browser came to my site, copied this page to your computer, and is showing you the copy on your computer. That copy lives in what is called the browser cache. Networks have worked this way since at least the ARPANET of 1969, and most likely also on pre-WWW systems like BBSes and CompuServe/Prodigy. So, the inexperienced Shell cluelessly sets up a page on the internet with the following demand listed on the bottom of the front page:
IF YOU COPY OR DISTRIBUTE ANYTHING ON THIS WEB SITE, YOU ARE ENTERING INTO A CONTRACT
Instantly, everyone visiting her site is agreeing to a contract, because the foundations of the internet require that a copy be made. Shell had no idea how the internet works. Applied to reality, Shell's demand always results in a contract: one cannot visit her site without creating one. From the court opinion that dismissed a plurality of Shell's claims:
In Specht v. Netscape Communications Corp., 306 F.3d 17 (2nd Cir. 2002), the court found a website’s terms of use unenforceable where a user had unimpeded access to the website contents and could only become aware of the terms of use by clicking on a separate icon located elsewhere on the website.
The motion to dismiss the breach of contract claim should have been granted as well. It is impossible to find out the terms of this contract without automatically agreeing to those terms. It is a basic concept in the common law of contracts that for a contract to exist, there must be an offer and an acceptance, and the material terms must be conveyed before the acceptance.

..:: Trade Practices

As a web coder for over 10 years, I can easily list off some of the most prominent trade practices of internet publishing. Hopefully these will assist others who are getting into web publishing. These are not to be confused with trade practices in web design that go towards issues like colorspaces, typography, and browser compliance. These rules are specifically about the interaction of hosts, other hosts, and infrastructure systems. Most of these rules are grounded in the unofficial Unenforceability Doctrine of the Internet. The doctrine states that if a host can technologically prohibit others from doing certain actions on the host side, the host has the obligation to do so. Additionally, if the host cannot technologically prohibit others from doing certain actions on the client side, the host cannot complain when users do not comply. This doctrine is seen in effect with the RIAA lawsuits. The RIAA is dumping tons of money into suing 4-year-olds, dead people, and people with no computers, all because it refuses to acknowledge the doctrine. Anyway, on to the list:

Rule 1) If you put it on the internet, expect it to be saved, copied, altered, used, and distributed without your permission, and for all eternity.

There's simply no way to stop this. The internet is founded on the notion that EVERYTHING transmitted is an exact copy of the original.
  1. Modern internet browsers are built on caching. They require the target file to be copied.
  2. Many large internet nodes (run by ISPs) will also cache popular large files like video and audio files to speed transmissions and cut bandwidth usage. Doing otherwise would decimate many sites' bandwidth allowances.
  3. Search engines are only usable because they store entire sites in the engines' local databases. If the search engines could not do this, it would take years to perform a single search on the web.
  4. Fair use covers a lot of material, even under the DMCA (more on Fair Use later).
It is simply impossible for the internet to run properly without these constructs in place. Doing otherwise would require massive overhauls that the foundation of the internet is simply unable to address. Further, the concept of transmitting content requires that it must be accessible on some level by the user.

Rule 2) Crawling and your best friend, robots.txt

The internet runs because of crawlers. If there were no robots crawling the web, search engines would typically only come up with pornography and freeipod sites. Most internet crawlers run by the large organizations obey robots.txt files. If you don't want your content crawled, you need a robots.txt file. Setting one up properly is incredibly easy. When search engines were generally designed around an opt-in approach (circa 1998), searching for content was simply inaccurate, and generally non-exhaustive. The internet has made huge strides in the past 10 years, and doing so has required the shift to opt-out trade practices. Rights-holders have not been ignored though, as most bots will obey robots.txt files and even rel="nofollow" attributes on anchor tags. When one conducts trade on the internet (by publishing anything and receiving any compensation for doing so from anyone), one must be aware of the current trade practices.
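To make the opt-out mechanism concrete, here is a minimal sketch in Python of how a well-behaved crawler might consult a robots.txt file before fetching a page. It uses the standard library's urllib.robotparser. The robots.txt contents, the bot names, and the example.com host are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the bot name and paths are invented for this example.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "http://www.example.com"  # placeholder host

# The archive bot is shut out entirely; every other bot only loses /private/.
archive_ok = parser.can_fetch("ExampleArchiveBot", base + "/articles/")
search_ok = parser.can_fetch("SomeSearchBot", base + "/articles/")
private_ok = parser.can_fetch("SomeSearchBot", base + "/private/notes.html")
print(archive_ok, search_ok, private_ok)
```

Keep in mind that robots.txt is only a request. It works because the large crawlers choose to honor it, which is exactly why it counts as a trade practice rather than an enforcement mechanism.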

Alternatively, one can check the browser type (the User-Agent header) and prevent pages from loading in unacceptable browsers (because crawlers only use specific rendering engines in very specific instances). One can also block off entire domains and IPs, preventing bots from reaching the site. Without the industry-wide shift to the opt-out approach, internet searches would take years. The reason Google can respond with 138 million results within 0.0000025 seconds is that everything on the internet is SAVED on Google's computers. Microsoft, Yahoo, and most other large search engines are exactly the same. The industry standard of the opt-out approach should not be thrown out because some clueless aspiring web publisher disagrees with the tried-and-tested system.
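As a rough illustration of blocking by browser type or IP, the following Python sketch decides whether a request should be refused. The block lists are placeholders (the address range is one reserved for documentation), and the function name is my own invention, not part of any real server's API.

```python
import ipaddress

# Hypothetical block lists: the agent fragments and network are placeholders.
BLOCKED_AGENT_SUBSTRINGS = ("examplearchivebot", "badcrawler")
BLOCKED_NETWORKS = (ipaddress.ip_network("203.0.113.0/24"),)


def is_blocked(user_agent: str, remote_ip: str) -> bool:
    """Return True when the request should be refused (for example with a 403)."""
    agent = user_agent.lower()
    if any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS):
        return True
    address = ipaddress.ip_address(remote_ip)
    return any(address in network for network in BLOCKED_NETWORKS)
```

A User-Agent header is supplied by the client and can be faked, so this only stops bots that identify themselves honestly. The IP check is the harder barrier of the two.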

Rule 3) Deep-Linking is inevitable

Many clueless site constructors are getting into a tizzy because others will deep link to pages on their site. What is deep linking? It's the simple process of creating a link to a page on a website that is not the index page of the site. In other words, http://www.djlosch.com is not a deep link, but http://www.djlosch.com/article_The_Seven_Rules_of_Trade_Practice_for_Web_Publishing is. A perfect example of an issue solely related to deep linking is the Jackass! Stage 1/3 Gentoo Linux Installation Method, a setup system for Gentoo Linux. Bob P, the creator of this customized installation process, became quite popular thanks to his install method, and later drew even more attention over the clueless provisions on his site. Of course, viewing his site "requires" agreeing to terms and conditions that cannot be seen until after combing through the site. One of these laughable terms is specifically:
[You may not:] Post a hyperlink that links to any location on our website other than to the index.html file located at http://bobp.homelinux.org.
The rest of Bob P's terms are the same kind of laughable joke, and they violate multiple other trade practices of web publishing (caching is inevitable, transfer fees, etc).

Often, deep-linking is mixed in with hot-linking because hot-linking is a type of deep-linking. Unfortunately, courts addressing deep linking are in disagreement. The Texas case involving the motocross videos is actually a combination of deep-linking policies and hot-linking. The defendant, arguing pro se, clearly fell short of making proper arguments. This is yet another reason why one should seek a qualified attorney, especially for cases regarding intellectual property law.

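In the meantime, one can easily prevent deep linking with an .htaccess file, or the following PHP code (not that I'd ever want to do it):
// Send visitors who reach an inner page from another site to the front page (assumes plain http).
$referer_host = parse_url($_SERVER['HTTP_REFERER'] ?? '', PHP_URL_HOST);
if ($referer_host && $referer_host !== $_SERVER['HTTP_HOST'] && $_SERVER['REQUEST_URI'] !== '/') {
	header('Location: http://' . $_SERVER['HTTP_HOST'] . '/');
	exit;
}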
There are a million other ways to implement that, not that doing so is advisable. It would really destroy your PageRank, as links to your site wouldn't necessarily have anything to do with your front page. Since deep-linking is so unbelievably easy to protect against, one who actually wishes to injure himself by preventing others from deep-linking should bear the burden of going against industry standards. If one wanted to make a list of those who are allowed to deep link, one could easily do so, but once again, this requires a little knowledge of programming.
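The list of permitted deep linkers mentioned above could look something like the following. This is only a sketch, written in Python rather than PHP, and the host names and function name are hypothetical placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical values: our own host plus outside hosts allowed to deep link.
OWN_HOST = "www.example.com"
ALLOWED_DEEP_LINKERS = {"www.example.org", "news.example.net"}


def should_redirect_to_front_page(referer: str, path: str) -> bool:
    """True when an outside site that is not on the list links to an inner page."""
    if path == "/":
        return False  # the front page is always reachable; also avoids a redirect loop
    referer_host = urlsplit(referer).hostname
    if referer_host is None:
        return False  # no usable Referer (typed URL, bookmark, privacy tools)
    return referer_host != OWN_HOST and referer_host not in ALLOWED_DEEP_LINKERS
```

Requests with no Referer are let through here, since many browsers and proxies strip the header. Treating them as hostile would lock out bookmarks and typed URLs too.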

Rule 4) Hot-linking is highly frowned upon, but the responsibility to protect against it is on the original host.

Hot-linking is equated with bandwidth theft. The hot-linker finds a picture or other media file that is hosted on another's server, and links directly to that file so that it is displayed on the hot-linker's page. This is deep-linking, but rather than sending the visitor to the host's site, the visitor remains on the hot-linker's site, usually with no idea that the content is being hot-linked. In the trade practice of internet publishing, hot-linking is easily prevented by an .htaccess file. It can also be prevented by custom header scripts like that shown above for deep-linking (because hot-linking is a sub-type of deep-linking). As a result, the hot-linking of a host's content creates the following rights-based relationship:
  1. The hot-linker may continue to use the content as posted, unless the host demands otherwise.
  2. The host is free to change the image by any means, including altering the actual image or just redirecting requests that arrive with the hot-linker's referrer to goatse, tubgirl, or any other horrible disgusting embarrassment the host can dream up. Even Microsoft has been goatsed for hot-linking.
  3. Upon being goatsed, the hot-linker may usually continue to be goatsed at his option, but probably will discontinue the heinous act after being thoroughly embarrassed. The hot-linker may threaten legal action, only to be further embarrassed when the story ends up on a social networking site like digg or slashdot, because hot-linking is considered theft of bandwidth. The hot-linker is making the threat:
    I'm going to sue you because you altered the content that I was stealing.
  4. Upon the resulting story ending up on digg or slashdot, the hot-linker may further threaten slander/libel/defamation lawsuits, only to be further mocked because what happened is usually all truthful. The hot-linker is making the threat:
    I'm going to sue you because you altered the content that I was stealing, and then told everyone about it. Now, they're making fun of me, DDOSing my site, spamming my mail box, and ringing my phone off the hook because of it, and even though I was stealing your content, this is all your fault.
    The hot-linker may further threaten lawsuits based on damage to business assets, loss of consortium, and other claims. In other words:
    I'm going to sue you because you altered the content that I was stealing, and now no one will let me photograph their wedding after you changed that picture I was using to one of a man holding his bloody anal orifice open.
  5. Ultimately, the hot-linker may not actually sue the host. Doing so would create an even bigger mockery out of the hot-linker's actions, and most likely a frivolous lawsuit counterclaim.
The bottom line is that leaving content open to hot-linking is the real world equivalent of the host leaving all of his money spread across his front lawn, and then complaining when someone steals it. It is the host's obligation to protect his content, but should someone hot-link that content, the host is not without remedy.
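For what it's worth, the host-side remedy boils down to a referrer check on media files. The Python sketch below picks which file to send back for a request. The host name, the substitute image path, and the extension list are assumptions made up for the example.

```python
from urllib.parse import urlsplit

OWN_HOST = "www.example.com"                 # hypothetical host name
SUBSTITUTE_IMAGE = "/images/hotlinked.png"   # hypothetical replacement file
MEDIA_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".mp3", ".avi")


def file_to_serve(requested_path: str, referer: str) -> str:
    """Pick the file to send back, swapping media embedded by other sites."""
    if not requested_path.lower().endswith(MEDIA_EXTENSIONS):
        return requested_path  # ordinary pages are never swapped
    referer_host = urlsplit(referer).hostname
    if referer_host is None or referer_host == OWN_HOST:
        return requested_path  # direct view, or our own page embedding the file
    return SUBSTITUTE_IMAGE
```

What the substitute image shows is left to the host's imagination.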

Rule 5) Right-click scripts are silly

You've seen these before. Some site from some small-time web developer installs a script so that when you right click on the page for whatever reason, you get a JavaScript popup that usually has some copyright notice. This rule is somewhat of a corollary to Rule 1, as the point is the same. However, the rationale behind this rule is different. Unfortunately for these inexperienced non-geeks, JavaScript is a joke, and this script has so many workarounds that cannot be protected against that disabling one's right mouse button on a web page is seriously just a nuisance.
  1. All popular browsers allow JavaScript to be disabled easily, while Firefox even has the NoScript extension that manages custom JavaScript usage.
  2. Anyone can left click, and before they release the left mouse button, hit the right mouse button. Then release the left button, and when the right button releases, the normal options menu will pop up.
  3. Anyone can view the page source from the menu bar (in IE, it is Source under the View menu, and Firefox has Page Source in the View menu as well). Find the file of interest and just copy the URL directly into the address bar.
Because of these basic exploits that require little to no technical knowledge, one cannot expect a right-click script to do anything except annoy the visitor, as the right click menu is often used to open pages in new tabs, bookmark pages, copy links, copy email addresses, copy text, and various other things in basic non-infringing uses.

Rule 6) Fair use allows most quotes and re-use of images

Fair use has been an immensely hot topic over the past few years, after the American DMCA neutered Fair Use and the Australians had Fair Use nearly abolished altogether. Meanwhile, countries across Europe have had their respective Fair Use concepts rewritten by record labels and movie studios to carve out more space for guaranteed profits. Either way, the U.S. Copyright Office specifically notes:
Under the fair use doctrine of the U.S. copyright statute, it is permissible to use limited portions of a work including quotes, for purposes such as commentary, criticism, news reporting, and scholarly reports. There are no legal rules permitting the use of a specific number of words, a certain number of musical notes, or percentage of a work. Whether a particular use qualifies as fair use depends on all the circumstances.
Some of the court recognized examples of fair use include:
  • quotation of excerpts in a review or criticism for purposes of illustration or comment;
  • quotation of short passages in a scholarly or technical work, for illustration or clarification of the author's observations;
  • use in a parody of some of the content of the work parodied;
  • summary of an address or article, with brief quotations, in a news report;
  • reproduction by a library of a portion of a work to replace part of a damaged copy;
  • reproduction by a teacher or student of a small part of a work to illustrate a lesson;
  • reproduction of a work in legislative or judicial proceedings or reports;
  • incidental and fortuitous reproduction, in a newsreel or broadcast, of a work located in the scene of an event being reported.
The first bullet point characterizes the majority of fair use claims. Often the use of non-copyrightable material is mistaken for fair use. Facts are not copyrightable, as they are in the public domain. This is as old as International News Service v. Associated Press. When Best Buy claims that their price lists are copyrighted, that's a complete crock. The trade practice in web publishing fully acknowledges very wide interpretations of fair use, despite what many corporations would try to have the public believe.

Rule 7) Expect your visitors to use AdBlock, BugMeNot, and other profit destroying annoyances

I don't care if my visitors use AdBlock. In fact, I encourage them to do so, because I seriously hate ads and would be a ridiculous hypocrite if I argued otherwise. I even outline how to set up MythTV with ad-evasion as one of the main goals. If I were seriously interested in advertisements, I would have implemented one of my many ideas to mitigate AdBlock effectiveness on my site quite a long time ago. I make a couple of dollars a month on the advertisements from the few people who actually see the ads, and that goes back to lowering my hosting and bandwidth expenses. All in all, this trade practice is grounded in the unenforceability doctrine.

..:: Conclusion

As companies like NetObjects, Adobe/Macromedia, and Microsoft make web publishing available to the average brainless twit, they should be required to let these unfashionably ill-educated folk know what some of the trade practices are on the internet. As more people join the hordes of Facebook, LiveJournal, MySpace, Blogspot, or set up their own WordPress-based site, there are certain things one must know. NONE of these companies are letting the ill-advised know what responsibilities and expectations they should have. If someone gets into an industry and fails to follow the trade practices and industry standards, they bear the burden in EVERY single industry. Software and web publishing should be no different. These trade practices are typically grounded in a host's ability to technologically enforce proposed restrictions, mostly because the only set of rules that applies to everything on the internet is the unenforceability doctrine itself. The internet is a worldwide system, subject to the laws of varying countries. What is illegal for a user in one country is not necessarily illegal in the next. If a host wants some rights, the host must carve out those rights with technology, not waste time and money litigating in court or lobbying Congress.

Post Last Updated: Mar 19, 2007 2:25 am
Comments
vaL wrote on Thursday, 12 April '07 - 4:18:29 AM -0400 [reply]
She scammed Internet archives. She has a degree in computer programming.
