
Court ruling regarding Google & copyright infringement -- important!

Leva
http://news.com.com/2061-10812_3-6031266.html

This is NOT good news for authors.

The case, in brief, involved an author who put a story on his website and then removed it. Before he removed it, Google's search bot (Googlebot) crawled his site, and Google saved the file and then presented it as "cached" material in Google searches.

The author sued, claiming he had not given google permission to repost the material on their server. Google won in court, because the author had not used a "no cache" or "no archive" tag, telling Google NOT to do that.
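(For anyone who hasn't run into it, the tag at issue is a robots meta tag that goes in the page's <head>; the "noarchive" value is the one that's supposed to keep the cached copy from being offered in search results, and the second line is the Google-specific variant:)

<meta name="robots" content="noarchive">
<meta name="googlebot" content="noarchive">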

I very sincerely hope this ruling is appealed and overturned. It's NOT a writer-friendly ruling. Basically, what the court ruled was that to prevent copyright theft, you have to actually tell the other party, using THEIR rules, not to take the article and republish it.

The way Google's cache works is that when Google crawls a Web site, it "saves" a copy of the site as it looks at that instant in time. When you do a Google search, you're given two options -- a link to the current Web site, or a link to Google's cached copy, which is hosted (published!) on Google's server(s). Mind, the saved page may have broken links, and if it's an old version, it may have outdated information.

If you look at a cached version of a Web site on Google, it's got Google's information at the top. So not only is it theft, they're branding YOUR work with THEIR trademark.

And moreover, say you sell exclusive pub rights to XYZ Writer's E-zine. They have paid $$$ for exclusive pub rights to your article. And along comes Google, and without paying either you or the E-zine, they steal your article and put it on THEIR site. And they don't pay you a dime. This is annoying under any circumstance. In my case, I make $$ on advertising on my site. The cache does not show my ads.

(I will ignore the irony, for the moment, of the fact that it's Google Adsense ads making me money.)

But wait! This is Google, right? It's okay if Google archives my site, some people say, because everybody goes to Google to FIND my site.

Well, there's a really scary precedent here and that is that you have to enter a code into your HTML telling Google NOT to archive it. These codes are not standardized. There's no law saying that Joe Blow's Search Engine needs to heed a "No Cache" code. Joe Blow's Search Engine could decide that their standard is a tag that says "No Robots" or "No Steal" or "Do-Not-Archive-On-Joe-Blow's-Search-Engine-Ever-Under-Any-Circumstances!" repeated after every 100 characters, and only valid on files ending in "s" ... you get the idea.

Say Joe Blow decides he wants to do a Web site on Red Widgets. Based on this ruling, he could set his Web site up with a note somewhere, in 2-point type, in lime green on yellow, in an obscure corner of the site, telling you how to opt out of being included. And it's complicated and difficult and a total pain in the butt, and you'd have to KNOW what he's doing to opt out -- and he doesn't have to tell you about it.

If you DON'T opt out, Joe Blow's Red Widget site automatically goes out, copies all your articles on Red Widgets, reposts them to Joe Blow's site. Because Joe Blow does this to as many websites as he can find, he has the biggest repository of articles on Red Widgets, and ends up ranked high in the search engine ratings. And then he plunks advertising down, maybe in frames, surrounding the archived pages that YOU wrote and he makes $$$. And maybe $$$$. (Adsense really does pay well.)

There's also a second issue here, and that's first pub rights.

From what I've heard, a lot of publishers look the other way regarding first pub rights if the story's been put in a passworded forum that's only open to members, for critique purposes. However, will they look the other way if it's already pubbed on Google? Hypothetically, you could put a story somewhere on the internet that you think is safe and Google could come along and stick it on their site.

A JavaScript-based password will not keep Google (or most search engines) from crawling a site, because crawlers don't execute JavaScript. CGI- or other server-side password protection probably will, until the day the password's not working right.
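(To illustrate the difference: a JavaScript "password" only runs in the visitor's browser, and the protected page is still sitting on the server where anything that knows the URL can fetch it. Server-side protection means the server refuses to hand over the page at all. On Apache, for example, a minimal sketch using HTTP basic authentication looks roughly like this -- the AuthUserFile path is just a placeholder:)

# .htaccess -- the server challenges every request for this directory
# (the path below is a placeholder)
AuthType Basic
AuthName "Members Only"
AuthUserFile /path/to/.htpasswd
Require valid-user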

Plus, even if you don't sell the story -- well, I can tell you, much to my chagrin, that some stories I wrote more than ten years ago are floating around on the 'net. They're horribly bad, and I wish they'd just go away, but Google's got them cached, and anyone who searches for my real name (like an employer -- it's funny the things you don't think of when you're a teenager!) will find these really bad, really embarrassing stories.

But the bottom line here is that it just became a whole heck of a lot easier for scraper sites to steal your copyrighted work -- and do so legally in the US. And then they can make money off YOUR work and you'll never see a penny of it.

I'm actually not all that worried about Google's archive itself. I'm very worried about the proverbial Joe Blow's Blue Widgets websites, which may take your articles under the guise of "They didn't say we couldn't!" and make money off them. This is a real possibility. They already do this illegally on a regular basis. And now it just takes a bit of programming to make it all legal.

Leva
 

Richard
Leva said:
I very sincerely hope this ruling is appealed and overturned. It's NOT a writer-friendly ruling. Basically, what the court ruled was that to prevent copyright theft, you have to actually tell the other party, using THEIR rules, not to take the article and republish it.

I'm very worried about the proverbial Joe Blow's Blue Widgets websites, which may take your articles under the guise of "They didn't say we couldn't!" and make money off them.

You misunderstand the situation here. Robots.txt isn't anything to do with Google's rules; it's the accepted internet standard for telling search engines what they can and can't do, which is a big reason why they won this case (and why they couldn't just walk away scot-free over Google Print). They're going to have the information either way, and having that cache available is generally a service -- it makes information available if your site goes down or gets slashdotted. It's also really, really easy to get stuff off Google if you own the site: http://www.google.com/webmasters/remove.html
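(For anyone who hasn't used it: robots.txt is just a plain text file at the top level of the site -- www.example.com/robots.txt, to use a placeholder domain -- and compliant crawlers are supposed to read it before they spider anything. A minimal sketch with a made-up path, telling Google's crawler to stay out of one directory, would be roughly:)

# tell Google's crawler to skip the forums
User-agent: Googlebot
Disallow: /forums/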

Had it been 'tell us if you don't want it', there'd have been a much, much stronger case for the plaintiff, and in your Joe Blow example you've still got every bit as much legal power over copyright infringers (note: not theft, infringement) as you had yesterday. You're in a much better position to prove harm over that sort of reuse, and the material is being presented in a very, very different way from the Google cache -- this ruling means nothing on any relevant level in the situations you describe. It's the difference between a library taking a photocopy of a piece of information and a publisher publishing a ripped-off anthology. If even the courts are saying something's fair use, the world's not going to end.

As for your stories, you'd find them anyway. Google has to keep copies of pages to build its database, and it's going to keep producing the links on demand -- broken or otherwise -- until either you or its spiders remove them. There's no other way such a service can work these days.

Far more relevant to this type of discussion are things like the Internet Archive, or Furl, or a million other sites that will take copies of your site as they were at a particular time, and build that in as a core part of their service. Not that I have a problem with them, but these are far more relevant if you want to fight the copyright issue. Google zaps links when it discovers they're dead - the idea is to offer a last-ditch attempt to get to a site that isn't responding rather than to mirror the whole web - while with these ones, the data's there for good.
 

Deleted member 42

What's more, if you send a letter via U.S. mail to Google, providing the data they need, they will remove your copyrighted material from their cache.

I think this is a non-issue.
 

Leva
Given the fact that I have appropriately set up "no cache" tags on my sites for forums and chatrooms and have had them cached in various search engines anyway, it IS an issue. The tags are not standardized. And THAT is the issue here -- there's no requirement for standardization at all, and there is no industry standard.

Plus, what if you simply have a typo? You won't know until the damage is done.

*shrug* My problem isn't what Google is doing. It's the precedent here that makes scraper sites hypothetically legal. It's bad enough when some idiot takes your work, steals it outright, puts THEIR AdSense code on it, and makes money off your work. Now, based on this ruling, it could be legal.

Leva
 

Richard
Leva said:
It's bad enough when some idiot takes your work, steals it outright, puts THEIR AdSense code on it, and makes money off your work. Now, based on this ruling, it could be legal.

A blanket "User-agent: *" rule in robots.txt is pretty simple, if that's really what you want -- and if a site or search engine doesn't respect it, there's nothing in this case that stops you suing them over the fact. That's no different from yesterday. Heck, you're possibly even slightly stronger today, knowing that it's going to be a major feature of the argument if you do have problems with them, rather than risking losing a major lawsuit over the matter.
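(Spelled out, the blanket version is just two lines in robots.txt, telling every compliant crawler to stay away from the entire site:)

User-agent: *
Disallow: /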

And there is no connection in any write-up I've seen on this between Google getting a pass on fair-use grounds, and the commercial exploitation of work via scraper sites.
 

Cathy C
Something I keep seeing in cases like this is that people want the best of both worlds. They want their site to be public -- for the information they place on it to be viewable -- but only on certain terms and with certain conditions. That's not the way it works. The public is one big paparazzi and Google is only the camera. Be careful what you do in front of the cameras, because the shutter is always clicking.
 

DaveKuzminski
There are already precedents and laws covering cameras that basically state, in general, that you can limit the use of cameras to take a picture of your artwork if it's not out in public view. In other words, if it's in your art museum where people can come to view it, you can dictate no cameras and the law is on your side. If it's outside the museum in full public view, then there's almost no way you can prohibit cameras being used. That, I believe, is what this is boiling down to. You basically have to enter someone's web site in order to take that picture. While the public is welcome, as in many free museums, you still have the right to restrict the use of cameras, or web shots as the case would be.

However, this is where I differ with that court's reasoning. They're assuming that everyone with a web page knows how to put together a proper sign restricting web shots when the situation is quite the contrary. Since that knowledge isn't as widely known, the burden should be upon the visitor to ask permission rather than just take shots of everything they're capable of entering. In other words, the law shouldn't favor only those who know the law and how to put together a proper HTML coded "no copying" sign. The law is supposed to protect everyone in an equal manner. When it doesn't, it's wrong.
 

HapiSofi
Non-issue, Dave.
 

Leva
Plus, based on the way the law is being interpreted, you have to use THEIR standards to say "no copying."

To extend the museum analogy, you now have to put up a sign that says "No copying" in EVERY language in the world. If the person doesn't know how to read the languages you've posted, then they can willy-nilly take all the photos they want.

I don't mind opting out of having certain pages cached. I really don't. I also don't mind having my pages cached in the majority of cases. Heck, Google's cache saved my butt the other morning when I saved a file over another file on my site. But I like having a choice. And there are certain things I really, really do NOT want cached -- like forum comments, or pages that are 100% unique to my site. If I write it myself, or if I pay for exclusive rights (and my site will start paying for material in April), then by golly, I want that material to be exclusive to my site -- I don't want people finding the same material I paid for on a scraper site.

The problem is that there is NO STANDARD. The current standards can and do vary. And the onus is on the writer to learn every single standard out there and apply every single standard to opt out.
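(Just to give a flavor of what "every single standard" can mean in practice, the page-level opt-outs alone pile up: on top of the generic robots tag there are engine-specific variants that the individual bots are said to recognize -- the bot names below are the usual ones for Google, Yahoo, and MSN, and I make no promise the list is complete. And that's before you even get to robots.txt, which is a separate mechanism again.)

<meta name="robots" content="noarchive">
<meta name="googlebot" content="noarchive">
<meta name="slurp" content="noarchive">
<meta name="msnbot" content="noarchive">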

And really, since when did you have to "opt out" to protect your copyright?

Rest assured, there ARE going to be sites that take total advantage of this ruling by providing a way to opt out in a non-standard format -- but not bothering to tell their victims about it. I have neither the time nor the money to actually sue them, and, again, that uses up time that's valuable. Having spent a bit of time recently dealing with some sites that swiped articles only hours old, I know the mentality there. However, if the court ruling sez they can legally cache my work (and then potentially profit from it), that takes away the "easy" way to deal with a recalcitrant site owner who's swiped stuff, which is to contact their ISP. ISPs may or may not be willing to boot a paying customer if the customer's not doing anything technically illegal.

I guess my major objection to this -- and judging from what I'm seeing on some of the major web publisher sites, I'm not the only one with ruffled feathers -- is that you have to opt out. Google isn't the issue; it's the rather predatory underworld of the internet. There already are a large number of web publishers out there who don't give a hoot about copyright, and now they have a legal loophole they can use to take full advantage of MY hard work.

Leva


DaveKuzminski said:
There are already precedents and laws covering cameras that basically state, in general, that you can limit the use of cameras to take a picture of your artwork if it's not out in public view. In other words, if it's in your art museum where people can come to view it, you can dictate no cameras and the law is on your side. If it's outside the museum in full public view, then there's almost no way you can prohibit cameras being used. That, I believe, is what this is boiling down to. You basically have to enter someone's web site in order to take that picture. While the public is welcome, as in many free museums, you still have the right to restrict the use of cameras, or web shots as the case would be.

However, this is where I differ with that court's reasoning. They're assuming that everyone with a web page knows how to put together a proper sign restricting web shots when the situation is quite the contrary. Since that knowledge isn't as widely known, the burden should be upon the visitor to ask permission rather than just take shots of everything they're capable of entering. In other words, the law shouldn't favor only those who know the law and how to put together a proper HTML coded "no copying" sign. The law is supposed to protect everyone in an equal manner. When it doesn't, it's wrong.
 

HapiSofi
Authors are forever fretting about their storylines being stolen, their work being electronically pirated, et cetera at great length.

Their anxiety is misdirected. With a very, very few exceptions, the great enemy of writers isn't theft. It's obscurity.
 

Mac H.
DaveKuzminski said:
However, this is where I differ with that court's reasoning. They're assuming that everyone with a web page knows how to put together a proper sign restricting web shots when the situation is quite the contrary.
You are wrong here. The court is not assuming that EVERYONE knows. The court is not assuming ANYTHING about anyone else. The court was ruling on this ONE case - not about what others may or may not know.

In this case:
1. He deliberately edited the robots.txt file to invite Google in.
2. He KNEW he could ask Google not to cache, but deliberately chose not to.
3. The ENTIRE THING was a SETUP so he could sue Google. He WANTED them to copy it so he could sue them.

That said, I still find it odd. The law says that you MUST ASK PERMISSION before copying - it simply doesn't allow you to have an opt-out scheme.

Jaws has a nice writeup on his site on this case.

Mac.
 

Leva
In the print world, this is true.

In a web publishing world, page rank is a large portion of what drives traffic to your site. Say you've got a site on blue widgets. You go to Google and type in "blue widgets," and Google then serves up a list of sites about blue widgets.

What Google & the other engines ALSO do is rank sites based on what it perceives to be their relevance. And original content, lots of original content, that is not duplicated elsewhere, helps your ranking. There are other things, but having a site that has tons of content nobody else has is one of the #1 ways to end up near the beginning of the list.

And the higher your rankings, the more people find your site. Which translates directly to money if you're either selling a product or making moola via ppc (pay per click) advertising.

In a context like this, having your content duplicated directly impacts YOUR rankings and the money in your pocket. Which is why people who are making money off web publishing get po'd and fuss a bit when a precedent is set like this.

And yeah, the guy probably set Google up by not setting his robots.txt up properly. More or less, Google obeys robots.txt if you follow their standards. The point is, the court has set a precedent that the bad guys can now use as a legal loophole to steal content without compensating copyright owners -- an action that can have a direct effect on publishers' pocketbooks.

Leva



HapiSofi said:
Authors are forever fretting about their storylines being stolen, their work being electronically pirated, et cetera at great length.

Their anxiety is misdirected. With a very, very few exceptions, the great enemy of writers isn't theft. It's obscurity.
 

Deleted member 42

Leva said:
In a context like this, having your content duplicated directly impacts YOUR rankings and the money in your pocket. Which is why people who are making money off web publishing get po'd and fuss a bit when a precedent is set like this.

Leva

Do you know anything about the way the Web works or about copyright law?

Because you really don't seem to.

All you have to do is tell the site that you want your content removed. They have to comply, or they face huge punitive fees. The DMCA, one of the most poorly conceived copyright laws ever written anywhere, is very explicit about the notification, the steps to be taken, and the time period. If the site won't comply, I guarantee their upstream ISP will. The law says the content must be removed on proper notification, with questions asked later. The plaintiff actually deliberately modified his robots.txt file in order to be able to sue. He never even asked Google to remove the cache--he simply sued. Google removes stuff all the time--it's not a big deal.

This case hasn't changed anything, at all. It was settled entirely on the basis of previous case law.

It's a total non-issue.

Read Jaws; he's a Web master, and an attorney with disgusting amounts of experience with copyright law--on and off the Web.

http://scrivenerserror.blogspot.com/2006/01/less-than-it-seems.html
 

Aconite
Leva said:
And yeah, the guy probably set google up by not setting his robots.txt up properly.
He deliberately set up the situation to produce just this result so he could sue. He admitted this.
 

Jaws
The Field decision means a lot less than just about everyone claims it does. I've commented on it on my blawg, for those who want to work their way through:
"Less Than It Seems" (26 Jan 06)

The short version is:
The court didn't like the way the plaintiff tried to manipulate Google and the court, and found for Google on those grounds. (Then there was the gross incompetence in drafting the complaint -- it didn't even allege the right cause of action, and Field himself is a lawyer!) The comments on "fair use" are at most side comments, because they are not necessary to reach the judgment actually issued by the court. Finally, even if the court had found an infringement (which it did not on these facts), the court held that the DMCA would have shielded Google from liability -- so long as Google took the material out of its cache upon notification, which it in fact did.
 

HapiSofi
Jaws, this puts me in mind of Rowling vs. Stouffer. I wouldn't have thought it was all that common for would-be blackmailers to bring suit for copyright infringement. Does this happen very often?
 

Sheryl Nantus
maybe I'm just too full of cheap Chinese food, but why not just AVOID putting up stuff on the net that you don't want to be seen by all and sundry?

don't post your short stories, etc. unless you're prepared to have them open to everyone and everything...

oh, look... fortune cookie!

*wanders off*
 

Leva
The issue here isn't not wanting people to read my material. It's the "how do you make money online" question.

For many writers, it's by designing a Web site with good content and then putting ppc advertising on it. This can be amazingly lucrative if you've made a good site.

When someone COPIES an article off a for-profit website, it hurts the web publisher's bottom line several ways:

1. By directly providing competition. If someone googles for information on blue widgets and someone's stolen an article, the searcher now has two places to land. If they go to the "scraper" site that swiped the article and click on the scraper site's ads, the writer doesn't get paid for that click.

2. By hurting page rank, as explained above. If your content is stolen multiple times, your pagerank can drop substantially.

3. Indirectly, it cheapens the information value of your site. If you write a story about blue widgets and the article gets posted on a hundred different sites, even your regular visitors might not come by as often -- because they know, consciously or subconsciously, they can go to any number of other sites and find identical articles.

Which, if you're trying to make money off a website, is why you defend your material vigorously.

I hope this ruling regarding site-caching will be a non-issue. But I have a sneaking suspicion that the bad guys will TRY to take advantage of it, even if they end up losing some lawsuits down the road. (For the real slime out there operating scraper sites, being shut down is just part of doing business. They steal content from thousands of sites and -- well, if you figure they've stolen a thousand sites, as a hypothetical number, and make a buck a day off each site's worth of stolen material ... there's real money in it.)

Leva

Sheryl Nantus said:
maybe I'm just too full of cheap Chinese food, but why not just AVOID putting up stuff on the net that you don't want to be seen by all and sundry?

don't post your short stories, etc. unless you're prepared to have them open to everyone and everything...

oh, look... fortune cookie!

*wanders off*
 

veinglory
But this is just a cache, not a competing site. I don't see how it will have any significant effect on the way the internet currently works.
 

roach
As has been repeated here several times, if someone takes your content and hosts it on a competing site you still have recourse. You go after them for copyright infringement. This case does not affect that in any way.
 

Mac H.
Jaws said:
Finally, even if the court had found an infringement (which it did not on these facts), the court held that the DMCA would have shielded Google from liability—so long as Google took the material out of its cache upon notification, which it in fact did.
Jaws, this is the bit I don't quite understand. I always understood that the DMCA covered ISPs who are hosting unlicensed copyrighted material. So if I copy something illegally and put it on my website, the DMCA protects my ISP, but not me.

In this case, surely the DMCA would have protected them from damages for HOSTING the cached material, but not for the original act of COPYING it in the first place? As you point out, copyright law doesn't allow an opt-out scheme in place of asking permission.

Again, this is an area about which I know very little - I'm just curious as to how it works.

Mac.
 

Leva
You do have recourse. But by the time you find the information, the damage may have already been done to your rank.

The courts will probably determine if this is a valid precedent or not. Jaws may be right, and it's not. But that's not going to stop the bad guys from testing it -- from setting up an "opt out" situation with an automated 'bot that swipes your copy and reposts it. Given the number of web publishers who believe this IS a precedent, there have to be quite a few bad guys out there who are also jumping up and down and cheering and thinking they've found a legal loophole for swiping content.

("caching" can quite easily become profitable. Just set up a search engine that displays the "cached" content along with the site's own ads.)

And unfortunately, while you certainly have legal recourse -- it's a major pain in the butt and a potential financial drain.

Leva

roach said:
As has been repeated here several times, if someone takes your content and hosts it on a competing site you still have recourse. You go after them for copyright infringement. This case does not affect that in any way.