February 13, 2012

Blogger blog marked as spam

I received an email from The Blogger Team saying one of my sites had been marked as an evil spam blog. The email urged me to request a review if I disputed the classification and went on to say:

“Your blog will be deleted in 20 days if it isn’t reviewed, and your readers will see a warning page during this time. After we receive your request, we’ll review your blog and unlock it within two business days. Once we have reviewed and determined your blog is not spam, the blog will be unlocked and the message in your Blogger dashboard will no longer be displayed. If this blog doesn’t belong to you, you don’t have to do anything, and any other blogs you may have won’t be affected.”

The Blogger Team said they find spam by using an automated classifier.

“Automatic spam detection is inherently fuzzy, and occasionally a blog like yours is flagged incorrectly,” the email says.

“We sincerely apologize for this error. By using this kind of system, however, we can dedicate more storage, bandwidth, and engineering resources to bloggers like you instead of to spammers.”

It’s all a bit strange really.

The help page explains how spam blogs are detected, and I can’t see why mine has been caught in this trap.

They say link spammers “can be recognized by their irrelevant, repetitive, or nonsensical text, along with a large number of links, usually all pointing to a single site”.

An archive of my editorials?

I’ve asked for a review, of course, and it will be interesting to see how long that takes.

Meanwhile, it’s all a little disconcerting.

If the case of mistaken identity isn’t sorted out within a week I’ll shift the site to WordPress or host the content myself.

Update: My site was given the all clear to continue.

Comments

  1. Ebony says:

    Interesting Michael, but the help page explanation is as clear as mud to me…. but this could be because I am as thick as a plank with anything spam, especially geek speak??

    Isn’t that why computers come with the very nice “delete” keyboard magic to use in case of emergencies, and especially for idiots like me when the book of directions is no help, in understanding the book of directions manual, anyway?

    Yeah I know, I am having a girlie dummy spit here…and I thought maths was tough!

  2. delmer says:

    “can be recognized by their irrelevant, repetitive, or nonsensical text…”

    I think I know what the problem is. Australia seems to be full of places with names that, while lyrical in nature and possessing additional poetic qualities, sometimes appear to be a random assemblage of letters just tossed together. Kalgoorlie, Warrnambool, Ballarat and Geelong are recent examples.

    A human being, especially one that comes here often, might say to himself, “‘Warrnambool’ … Is Michael pulling our legs? He’s normally not one to joke about things like this so I’ll have to assume ‘Warrnambool’ is, in fact, a real place.”

    Software, on the other hand, might see “Warrnambool” and, lacking a pre-existing writer/reader relationship, simply toss up an error.

    I’ve nothing in the way of science to back this up with. It’s just a hunch.

    • Michael says:

      You might be onto something there Delmer. Makes more sense than Google’s explanation.

      PS: The warning has now been lifted. I didn’t receive an email, but I no longer have a captcha test when writing in Blogger.

  3. Adam Naiova says:

    The Blogger Team said: “[Spammers]… can be recognized by their irrelevant, repetitive, or nonsensical text …”

    Perhaps The Blogger Team not only runs a blog-hosting application, but offers polite ‘constructive criticism’ as an additional extra?

    No, just joking, no offence intended Michael, LOL!

    It is interesting (albeit with very little practical application) what computer analysis of literature and written publications can tell us.

    I remember watching a documentary on the ABC a year or so ago about Agatha Christie, I think it was called ‘The Agatha Christie Code’ or something like that. Anyway it concentrated to some extent on Christie’s 1926 temporary disappearance, but mainly on a computer analysis of her works.

    Whilst most of the locked-room mystery writers of the era used fairly similar ‘templates’ for their stories, it showed that Christie’s works were unusually linguistically repetitive compared to other writers and that almost all her works shared extremely similar patterns of writing, sentence length, proportions of adjectives to nouns, etc.

    So I wouldn’t worry about being repetitive, you are in the good company of the world’s second (or first, the title is disputed with William Shakespeare) highest-selling fiction writer.

    Although I’m not sure to be honest, that an editor or journalist would want to be compared to a fiction writer?

    • Michael says:

      Journalists and editors are all wannabe fiction writers, Adam. We became newspaper hacks because we wanted to be paid for writing and weren’t good enough to succeed at real literature.

      Nice to see I set myself up for a few wisecracks by linking to the Google spam blog definition.
