Tag Archives: spam

Quick Tip: Keeping Comment Compliment Spam off your Blog

Blogs are great because they give you a creative outlet and let your readers comment on your posts, making blogging a much more social experience.  But spammers take advantage of comment forms, using scripts and bots to fill the web with links back to their sites.

What can you do about it?  Even with captchas, systems like Akismet, and other automatic techniques (you can read more about these here), some spam will slip through.  Specifically, compliment spam.

What is compliment spam? Spammers know you and I like to be told what great writers we are, how helpful our posts are, and that we are brilliant geniuses.  So they set their bots to spam you with complimentary comments that just so happen to link back to their crappy blog, online casino, or fake viagra store.  Here’s an example:

Typolight
http://www.typolight-blog.de | info@typolight-blog.de | 82.146.49.61

Thanks, you nice post that helped me alot.

From Keep your WordPress site from being hacked with automatic upgrades, 2008/09/06 at 9:27 AM

So, at first glance this looks like a legit comment.  The post in question was a “how-to”, so it would be nice to hear that someone found my instructions helpful.  But, do a Google search with the comment in quotes (an exact phrase search) and you’ll see the problem:

http://www.google.com/search?q=%22Thanks%2C+you+nice+post+that+helped+me+alot.%22

At the time of this writing, we see 168 instances of this exact comment.  By this same Typolight person.

So that’s my tip – if a comment seems a bit too randomly complimentary, throw it in quotes and do a Google search. Then, if it’s spam, make sure to mark it as spam – systems like Akismet only work because we’re all reporting spam.
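If you check a lot of comments, typing the quotes by hand gets old. Here’s a minimal Python sketch that builds the same kind of exact-phrase search URL as the Typolight example. The function name `exact_phrase_search_url` is made up for illustration, and it only builds the link; it doesn’t fetch or count results.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(comment_text: str) -> str:
    """Build a Google search URL that looks for the comment as an exact phrase."""
    # Wrapping the text in double quotes asks for an exact phrase match.
    phrase = '"' + comment_text.strip() + '"'
    # quote_plus percent-encodes the quotes and comma and turns spaces into '+'.
    return "http://www.google.com/search?q=" + quote_plus(phrase)


if __name__ == "__main__":
    print(exact_phrase_search_url("Thanks, you nice post that helped me alot."))
```

Paste the printed link into your browser and skim the results; lots of identical hits from the same commenter is the giveaway.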

If you really want to go after the spam poster, you can also give their site a bad rating on Web of Trust, StumbleUpon, and other reporting systems.

Maybe if I get some time I’ll throw together a WordPress plugin to make this easy to do.  If you’d like a plugin like this (or have other tips), drop me a comment and it will help motivate me.

How to keep spam off your blog, bulletin board, or forum

Spam, it’s not just for breakfast and email anymore.  Webspam is a huge problem – if you run a blog or a forum, you’re probably familiar with the gobs and gobs of gibberish being posted all over the web by spammers.

This humble blog, which only gets a few hundred visitors per day, has had over 17,000 spam comments since I moved over to WordPress last year.  Having your site inundated with comment spam can be just as big a headache as getting hacked.  No one wants to spend hours every day sorting the good posts from the bad.  I’ve already written about how to totally clear out a spammed forum and erase all traces of its reputation-marring existence, but the best solution is prevention.

Here are some steps you can take to help prevent spam on your blog or forum.

Keeping Spam off Your Blog

This section assumes you’re hosting your own blog and can add plugins and make configuration changes.  My examples will be WordPress-heavy because I’m more familiar with WordPress.

Option 1:  Close or restrict comments. Most blogs give you some options to restrict who can comment on articles.  In WordPress, look under Settings -> General to require that users create accounts before they can comment.  This might not help too much, since I’ve seen hundreds of automated user accounts created right alongside the spam.

You can also require that comments are approved before they appear – in WordPress look under Settings -> Discussion.  This will stop your blog from being graffitied without your knowledge but also requires manual effort.  You can also disallow trackbacks and pingbacks, which are really cool in theory but a major avenue for automated spam.

You can also shut down comments completely, or disable comments on old posts.  At that point you may be throwing the baby out with the bathwater, but it’s certainly effective.

Option 2:  Make sure commenters are real people with a captcha. Even if you’re not familiar with the term, you’re familiar with captchas.  They’re the little widgets at the end of a form where you have to decipher some scrambled text from an image.  Many blogs have captcha options built in, but if you’re looking for a captcha plugin be sure to balance usability with security.

I’ve used the Did You Pass Math plugin with some success.  Jeff Atwood has used an extremely simple captcha for years on his high-traffic blog.  reCAPTCHA is a really cool project that helps fight automatic posting and digitize old books at the same time.
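To give a feel for how simple a captcha can be, here’s a rough Python sketch of an arithmetic challenge. This is not how the Did You Pass Math plugin or any other real plugin is implemented; the function names `make_math_challenge` and `check_answer` are hypothetical, and a real form would also need to remember the expected answer on the server side (in a session, for example) between showing the question and checking the reply.

```python
import random


def make_math_challenge(rng=random):
    """Return a (question, expected_answer) pair for a simple addition captcha."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(expected: int, submitted: str) -> bool:
    """Accept the comment only if the submitted text is the expected number."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Non-numeric input (or an empty field) fails the check.
        return False


if __name__ == "__main__":
    question, expected = make_math_challenge()
    print(question)
    print(check_answer(expected, input("> ")))
```

Something this simple won’t stop a bot written specifically for your site, but it’s easy on real readers, which is the usability side of the trade-off.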

Option 3:  Use an automatic filtering system. If you’re using WordPress, I have three words for you:  Akismet, Akismet, Akismet! Seriously, Akismet is so good at automatically marking spammy comments and trackbacks that it’s almost scary.  If you’re not using WordPress, you may still be able to find an Akismet plugin for your blogging platform.  There are other systems worth trying as well, such as Spam Karma, but I have less experience with those.

Keeping Spam off Your Forum

Again, I’m assuming you are hosting the forum yourself or can otherwise make config changes.  I’ll use phpBB (version 3) as an example because I’ve used it in the past.

Option 1:  Restrict user accounts. This can be a tough call, because when you start a forum you want to make it as easy as possible for people to join in the discussion.  Unfortunately, allowing anyone to register and begin posting without any admin approval also opens the door for spammers.

In phpBB this setting can be found in the Administration Control Panel under Board Configuration -> User Registration Settings.

Option 2:  Again with the captchas. Captchas aren’t 100 percent guaranteed to remove spam but they do help.  If your forum software doesn’t have a captcha or a captcha plugin, I would seriously consider upgrading to a version that does or switching forums completely.  I know it’s a huge pain but waking up one morning to find 10,000 spam posts is even worse.

In phpBB3 look under Board Configuration -> User Registration Settings for a setting called “Enable visual confirmation for registrations” and make sure it’s turned on.  You can change the details under Board Configuration -> Visual confirmation settings.

Option 3:  Try to find an automatic filtering system. This is harder than for blogs.  There was an Akismet phpBB mod but it’s apparently not being maintained.  There’s a workaround involving the Spam Words mod that you can read about here.  The Spam Words mod might be worth trying on its own too.  Here’s a thread with more options for phpBB2; search around and find what’s available for your forum software.
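If you’re curious what a word-list filter boils down to, here’s a small Python sketch of the general idea. It is not the Spam Words mod’s actual code: the word list, the threshold, and the function names are all assumptions made up for illustration.

```python
import re

# Hypothetical word list; a real one would be longer and tuned to your forum.
SPAM_WORDS = {"casino", "viagra"}


def count_spam_words(text: str, spam_words=SPAM_WORDS) -> int:
    """Count how many words in the post appear on the spam word list."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return sum(1 for word in words if word in spam_words)


def looks_like_spam(text: str, threshold: int = 1) -> bool:
    """Flag a post once it contains at least `threshold` listed words."""
    return count_spam_words(text) >= threshold


if __name__ == "__main__":
    print(looks_like_spam("Cheap VIAGRA at our online casino"))
    print(looks_like_spam("Thanks for the phpBB tips"))
```

A filter like this is better at holding posts for moderation than at deleting them outright, since legitimate posts (like this one) can mention the same words.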

Even without automated filtering, you can try to slow down the spammers by setting a time limit between posts (most human beings don’t type as quickly as spambots do).  Other options, such as disallowing links and BBCode, are pretty drastic but might make your forum less enticing to spammers.
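The time limit between posts is easy to picture in code. Here’s a minimal Python sketch, assuming you track the time of each user’s last accepted post in memory; the `PostThrottle` class and its 30-second default are hypothetical, and most forum software has its own setting for this that you should use instead.

```python
import time


class PostThrottle:
    """Reject posts that arrive too soon after the same user's last post."""

    def __init__(self, min_seconds_between_posts: float = 30.0):
        self.min_gap = min_seconds_between_posts
        self._last_post = {}  # user id -> time of last accepted post

    def allow(self, user_id, now=None) -> bool:
        # `now` can be passed in for testing; otherwise use a monotonic clock.
        now = time.monotonic() if now is None else now
        last = self._last_post.get(user_id)
        if last is not None and now - last < self.min_gap:
            return False  # too soon; the rejected attempt does not reset the timer
        self._last_post[user_id] = now
        return True


if __name__ == "__main__":
    throttle = PostThrottle(min_seconds_between_posts=30)
    print(throttle.allow("spambot"))  # first post goes through
    print(throttle.allow("spambot"))  # an immediate second post is refused
```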

Just for fun:

Spam, spam, bacon, and Spam

The Urge to Deletion: Is Wikipedia making molehills out of mountains?

Wikipedia is great.  Even now, it’s still kind of amazing that such a huge body of knowledge has been organized ad-hoc by volunteers, most of whom have never met in person. Most social software systems would die for this level of collaboration.

That said, has anyone else gone to a random Wikipedia article from, say, search results and ended up a little depressed?  It seems like every other article I find lately has a big warning label at the top – this article contains too much trivia, this article has too many fictional references for an encyclopedic and academic approach of this topic, and the worst one of all: this article has been marked for deletion.

I understand that it must be very difficult to wrangle all the millions of contributions into a consistently high-quality encyclopedia.  Just dealing with all the spam and abuse must be an enormous undertaking, even when distributed among thousands of good samaritans.  But one of the things that was great about Wikipedia was the breadth of coverage and the depth on some particulars, even if it was excessive to the point of comedy.

But a brief look at the list of articles marked for deletion the last few days illustrates my point.

1. Horse Ranch Mountain. You know there’s something wrong when a mountain doesn’t meet the notability requirement.   Here’s the comment opening the deletion on the talk page:

In what way is Horse Ranch Mountain notable? I am quite familiar with the area, and I cannot think of any way in which it is notable. Please convince me otherwise.

I would think it’s notable because it is a mass of millions of tons of rock and earth sticking out of the ground.  On a less sarcastic note, I’m sure I’m not the only one who’s looked at a map, spotted a feature I’d never heard of, then looked it up online.  Even if it’s not accessible it’s probably helpful to have a reference noting that it’s the highest point in Zion, measured at X meters tall, etc.

2.  List of redundant expressions. I understand the argument that an encyclopedia is not a trivia game or a book of lists, but these sorts of pages used to be one of my favorite features of Wikipedia.  Exhaustive lists of palindromes, English words of Polish origin, etc., give examples, context, and can help connect concepts in language.  Also, the use or omission of redundancy is an important stylistic consideration when writing – it can be used for everything from emphasis to characterization.

3.  Hindu literature. Delete the article on Hindu literature?  Granted, the article needs work.  But isn’t it worrying how the marked for deletion pages are filled with subject matter from outside the U.S. and maybe Europe?

I know the standard answer to complaints like these is that if you feel so strongly, you should participate in the debates and push for things not to be deleted.  Judging by the talk pages I wonder if I would be drowned out by all the “I’m a history major and this is a programming term, never heard of it, not notable” comments.  I’ll admit my contribution to Wikipedia is limited to random spelling and grammar corrections that were obvious enough that even I noticed them, so I could be wrong.  I just feel like some of what made Wikipedia so addictive is slowly being drained away.

Agree?  Think I’m wrong?  Leave me a comment below.  See, it’s kind of like a talk page, but even with consensus you can’t edit my article.   Until the next WordPress exploit comes out.