Aaaarrrrggghhhhlgorithms

It’s already a well-worn trope that algorithms are good for some things (processing huge amounts of data) and bad at others (anything to do with human interaction) and yet Instagram has now joined Facebook, Twitter and Uncle Tom Cobleigh in rolling out an algorithm that purports to display the ‘most important’ things in your feed.

In short: stop it.


In long: It began – as many things do – with Google. Google were the first people to do a good job of automating the process of crawling and ranking websites in response to queries.

Prior to that, things like Yahoo, DMOZ, Best of the Web etc used human eyeballs to judge the quality of sites and pop them into categories. And guess what? Humans are both imperfect and corruptible and trying to put entire websites into one category is often impossible (hence my long-running belief that Schema is a backwards step). I can still just about recall the days when getting a site onto DMOZ was phase 1 of an SEO campaign, and meant trying to find someone who either accepted anything that was put in front of them, or someone who would accept anything that was put in front of them alongside a brown envelope with some cash in it.

So Google’s programmers wrote an algorithm. It followed every link to see where it led, and added that place to its index. Then it calculated the importance of pages based on the number of links, and the rest you know (i.e. their swift rise to total dominance made the internet accessible to all, and also hopelessly corrupted its very nature, turning everything into a commercial shitstorm and an entire economy based on the whims of this algorithm).
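To make that description concrete, here is a minimal sketch of “follow every link, add it to the index, then rank by number of links”. The toy web, the page names and both function names are invented for illustration. Counting raw inbound links is the simplification the paragraph above describes; the real PageRank calculation also weights each link by the importance of the page it comes from.

```python
from collections import Counter, deque

# Hypothetical toy web: each page maps to the pages it links to.
TOY_WEB = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],  # nothing links here, so a crawl from a.example never finds it
}

def crawl(seed, web):
    """Follow every link outward from the seed, adding each newly found page to the index."""
    index, queue, seen = [seed], deque([seed]), {seed}
    while queue:
        page = queue.popleft()
        for target in web.get(page, []):
            if target not in seen:
                seen.add(target)
                index.append(target)
                queue.append(target)
    return index

def rank_by_inbound_links(index, web):
    """Order indexed pages by how many links they receive from other indexed pages."""
    indexed = set(index)
    inbound = Counter({page: 0 for page in index})
    for page in index:
        for target in web.get(page, []):
            if target in indexed:
                inbound[target] += 1
    # Most-linked first; ties broken alphabetically so the order is stable.
    return sorted(index, key=lambda page: (-inbound[page], page))

index = crawl("a.example", TOY_WEB)
ranking = rank_by_inbound_links(index, TOY_WEB)
```

Note that d.example never makes it into the index at all: if nobody links to you, a link-following crawler has no way to find you.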

So. Algorithms have a place. It would be an act of absolute folly to try to replicate what Google does with humans. Google still pay humans to do a bunch of testing, but RankBrain is the first tolling of the bell for those guys.

Google search engineers, who spend their days crafting the algorithms that underpin the search software, were asked to eyeball some pages and guess which they thought Google’s search engine technology would rank on top. While the humans guessed correctly 70 percent of the time, RankBrain had an 80 percent success rate.

As most of the commercial web hands over usage data to Google through Analytics, and drives traffic to their web pages via AdWords or other Google properties, so mass data tools can be used to supplant human imperfections. If a site has a high bounce rate for a particular query, Google might fairly surmise that that site is actually not suited to that query and start to drop it down the rankings. Finding a replacement for ‘linkjuice’ has probably been Google’s top priority for years now, and each turn of the ratchet brings the end game closer.

Naturally, Google having set this tone means that every company wants to have an algorithm, BECAUSE ALGORITHMS. But it’s not always clear who these algorithms are meant to serve, or to what end outside the very specific needs of Google.

Twitter and Instagram are two big brands that have recently rolled out algorithms to their products that actually serve to defeat their own very nature.

An example: I follow a bunch of people on Twitter – from friends with a handful of followers to big accounts who tweet (seemingly 24/7 – get a life, guys!) about the search industry. Trad Twitter just showed everything in chronological order, which is perfect for the medium. How will any algorithm determine relevancy? No one outside Twitter’s engineers knows. While I’m not a programmer any more, I do know that all they’ll be doing is chugging data. The algorithm will pick tweets for me from the people I follow based on one or both of two things:

  1. Tweets which have had lots of engagement (I wouldn’t know about that: my stats are lousy)
  2. Tweets from people I most commonly interact with (which is about three people).
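As a rough sketch of those two signals, the snippet below compares a chronological feed with a scored one. Everything in it is a guess: the `Tweet` fields, the account names, the numbers and the weights are all invented, and nobody outside Twitter knows what the real scoring looks like.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    author: str
    posted_at: int   # hypothetical timestamp, larger means newer
    engagement: int  # likes plus retweets, invented numbers

def chronological(feed):
    """Trad Twitter: newest first, nothing else considered."""
    return sorted(feed, key=lambda t: t.posted_at, reverse=True)

def algorithmic(feed, my_interactions, engagement_weight=1.0, interaction_weight=50.0):
    """Guessed ranking: tweet engagement plus how often I interact with the author."""
    def score(t):
        return (engagement_weight * t.engagement
                + interaction_weight * my_interactions.get(t.author, 0))
    return sorted(feed, key=score, reverse=True)

feed = [
    Tweet("big_seo_account", posted_at=1, engagement=900),
    Tweet("close_friend", posted_at=2, engagement=3),
    Tweet("quiet_account", posted_at=3, engagement=12),
]
my_interactions = {"close_friend": 20}  # I never interact with the other two

chrono_order = [t.author for t in chronological(feed)]
ranked_order = [t.author for t in algorithmic(feed, my_interactions)]
```

In the chronological feed the quiet account’s newest tweet is at the top. In the scored feed it drops to the bottom, because it has neither engagement nor any history of interaction with me.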

And why are they doing this?

Ostensibly to ‘improve’ Twitter so it behaves more like Facebook and thus can attract the idiots who populate that horrid corner of the internet. In reality, we all know that they’re ramping it up so they have a further means to shove advertising in. At first it will be indirect (“this tweet from Celebrity X was amazingly popular”) but then as Twitter’s finances continue to get worse they’ll just use it to sell another slot to advertisers until eventually they give up and sell to Yahoo! for them to hammer the final nails into its coffin.

And so it will come to pass with Instagram, Snapchat and whatever-the-fuck the “next big social media site” is. (hint: not Google+)

And why is this a bad idea?

There are some people I follow who I never engage with and who don’t have big follower counts, but whose content sparks trains of thought that otherwise might not cross my mind. Someone like @TheWarNerd tweets infrequently and has relatively (in the scheme of things) a low follower count of under 10,000 – but I wouldn’t want to miss a single tweet.

You know how this plays out without me having to type it. Shorn of metrics to analyse the way I follow him, Twitter will probably conclude that he is ‘less important’ and shove his tweets down the line. Instagram will do exactly the same thing and it will suck for exactly the same reasons.

Why does this keep happening?

I have a bit of a theory about why big tech companies start to “improve” their products until such a point that they become an unusable mess and die. It’s because of the typical life cycle of a platform and the peril of having a team of immensely talented idiots around the place.

  1. Great idea
  2. Great PR begets exposure
  3. Exposure begets big investment from someone
  4. Big investment means recruiting a whole bunch of Really Smart Guys to help get things scaled up
  5. A whole bunch of Really Smart Guys begets boredom when the scaling has been done and the name of the game is administration
  6. Boredom begets dwindling PR exposure
  7. Dwindling PR exposure begets the need to announce things
  8. The need to announce things begets asking the Really Smart Guys to think of ways to ‘improve’ things.
  9. Really Smart Guys’ improvements beget disenchantment because Really Smart Guys don’t know shit about how humans work
  10. Disenchantment begets falling user numbers
  11. Falling user numbers mean Microsoft or Yahoo! buys them – partly to get the Really Smart Guys and partly because they can’t think of their own ideas
  12. Someone realises it’s all been a huge waste of money
  13. It shuts

I don’t even know why I’m writing this, or where it’s going – other than it’s been almost a year since I wrote anything on my blog and one simply must keep up with these things so no one sees behind your carefully constructed facade to find the jaded 40-something web manager within.

Anyway, the point still stands: take yourself and your stupid algorithms and get in the bin.

 

Google in the mobile ecosphere

Mobile is a problem for Google. It’s a paradigm shift that few saw coming just 5 or 6 years ago, but the launch of the pocket web has begun to completely reshape our experience of the internet, which sites we interact with, and how we organise our spending. There are two reasons this creates problems for Google.

Less space for ads + different user XP

Here is a current screen cap of a typical Google search for “used ford kuga”.

kuga

I’ve highlighted the ads in yellow. Of the 10 available positions on the page, no fewer than 8 of those are paid ads. Given this experience, there is plenty of choice for the consumer even without scrolling down. In the desktop-only world in which Google’s ad model was conceived, this is wonderful – as you have many advertisers, all trying to appear in those top 8 slots, plus another 3 or 4 willing to appear further down the page.

On mobile the story is different. Here is the same result – again with the ads highlighted in yellow.

mobile1

As you can see, the amount of screen space given over to ads is similar – with only one organic result. But that screen space includes just 2 ads.

That means greater competition for those two spaces, which in theory means higher CPCs for those wishing to compete in that space.

But, from a user perspective – and assuming we value choice – there is very little utility there. If I want to see diversity, I scroll. And here’s the rub: we DO scroll with our thumbs.

The old rubric about organic listings in particular was that most of the traffic went to the top 3 sites and that anywhere further down the list was almost invisible (I exaggerate a little). But that was driven in part by the mouse-point-click metaphor that belonged to the desktop. With a scrolling interface, we are accustomed to the easy flick of a thumb to see more: hell – the interface demands it.

Thus, on mobile, it is likely that fewer people will actually click the ads as they will assess what they see and move down the page.

In short, whereas equity on desktop is split across potentially 12 different ads, the opportunity for Google on mobile is less. Even if you include the bottom of the page that’s still just 3 additional slots.

mobile2

5 slots instead of 10-12 means fewer opportunities for clicks. All things being equal, Google would have to see double the CPC or CTR from these ads to generate the same revenue from a mobile search as from a desktop search. And here we hit the wall of reality: in most markets, vendors are selling the same product with the same costs and the same margins. Investors might have been impressed by the stunning growth of the internet and Google’s revenues, but very real limits exist driven by real-world costs.
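The arithmetic behind “double the CPC or CTR” can be sketched as below. It assumes a deliberately crude model in which every ad slot has the same click-through rate and the same cost per click, so revenue per search is simply slots × CTR × CPC. The function names and the 2% and 50p figures are invented for illustration.

```python
def revenue_per_search(slots, ctr_per_slot, cpc):
    """Expected ad revenue from one search under the equal-slots assumption."""
    return slots * ctr_per_slot * cpc

def required_multiplier(desktop_slots, mobile_slots):
    """How much CPC (or CTR) must rise on mobile to match desktop revenue."""
    return desktop_slots / mobile_slots

# Invented figures: 2% click-through per slot, 50p per click.
desktop = revenue_per_search(slots=10, ctr_per_slot=0.02, cpc=0.50)
mobile_same_price = revenue_per_search(slots=5, ctr_per_slot=0.02, cpc=0.50)
mobile_double_cpc = revenue_per_search(slots=5, ctr_per_slot=0.02, cpc=1.00)

low_estimate = required_multiplier(10, 5)   # against 10 desktop slots
high_estimate = required_multiplier(12, 5)  # against 12 desktop slots
```

Against 10 desktop slots the multiplier is 2; against 12 it is 2.4. If mobile users also click ads less often, the real figure would be higher still.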

If I can buy Blue Widgets at £5 then so can my competitors. Having first-mover advantage on the internet might mean a window where I can buy clicks for 10p, sell the widget for £10 and thus make a profit according to my conversion rate. That doesn’t last, however. As more people come in, the market matures, margins narrow and thus the money available to spend on clicks declines. According to economic theory, the marginal profits on fungible goods are effectively zero. No wonder then that Google’s CPCs have been in decline for some time.
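Here is the widget example as a quick sum. The £5 cost, £10 price and 10p click come from the paragraph above; the 5% conversion rate and the £6 “mature market” price are assumptions added purely to make the numbers work, and the function names are hypothetical.

```python
def break_even_cpc(sale_price, unit_cost, conversion_rate):
    """Highest cost per click that still leaves zero profit per click."""
    return conversion_rate * (sale_price - unit_cost)

def profit_per_click(sale_price, unit_cost, conversion_rate, cpc):
    """Average profit earned on each click bought."""
    return break_even_cpc(sale_price, unit_cost, conversion_rate) - cpc

CONVERSION_RATE = 0.05  # assumed: 1 click in 20 ends in a sale

# All prices in pence. First mover: buy at 500, sell at 1000, clicks cost 10.
early_profit = profit_per_click(1000, 500, CONVERSION_RATE, cpc=10)

# Mature market (assumed): competitors push the sale price down to 600.
mature_ceiling = break_even_cpc(600, 500, CONVERSION_RATE)
```

Under these assumptions the first mover makes 15p on every 10p click, while in the mature market anything over 5p a click loses money. As the margin heads towards zero, so does the most anyone can rationally bid.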

This probably goes some way to explaining the resurgence of brand building and display – none of which favours Google.

Apps are… better

Compounding Google’s problem on mobile is the very core of the mobile experience: the app.

I’ve opined before how a horizontal search engine such as Google is actually pretty clumsy when it comes to vertical searches like holidays, clothes shopping etc (other opinions are available). I’ve booked a couple of dozen hotels over the last year or two and the number of times I’ve used Google as part of that process? Zero*.

There’s always a danger of reading too much into your own personal experience (after all: I’m quite experienced and savvy these days) but specific apps just seem to make so much more sense than meta engines.

If you were actually looking to buy a Ford Kuga as per my example, downloading the AutoTrader app would make for a far better experience than clicking through 5 or 6 different websites while trying to learn their varying internal logics and navigational methods.

In conclusion…

Google are still a money printing machine – even on mobile. That isn’t going to change any time soon and any advertiser who can afford to has to be in the game. The day that the mobile web is the web is already here, and Google’s recent ‘mobilegeddon’ update is tacit acknowledgement of that fact.

In display, Google rule the roost – with the world’s most popular video channel, largest display network, and other native advertising tools for marketers to take advantage of: all of which yield good results if handled properly.


 

*Actually, that’s a small lie. A small example: when I stayed in Glasgow recently, I used Google to find out where the hotel was in relation to various things I wanted to see and visit but the important bit – the transaction – was carried out through the booking.com app. What Google couldn’t do was monetise me.

No: Google doesn’t index your meta description

I was asked recently whether Google actually indexed your meta description and I was about to say “of course!” when I had one of those rare flashes of caution and decided to check.

Our company’s home page has the natty meta description of:

“Trusted Dealers is THE SAFEST place to find and buy a second hand car online, with 10 points of difference to ensure you are happy with your deal.”

I know, I know – it could probably do with a refresh. Anyway, when you search Google for ‘trusted dealers’ it is this that displays in the SERPs.

meta1

Perfect, eh? Yet when you use the site: command to check whether it has been indexed, Google wags its finger and says no.

meta3

So that’s that cleared up and if anyone asks you, you can tell them ‘no’ and that I told you so.

But! The phrase does appear all over the internet. Remove the site: command from the query and no fewer than 129,000 matches are returned. Such as…

meta2

This happened because our blog was hacked for a brief while and a funnel page to a network of gambling affiliates was placed on the site. Someone then built a few hundred thousand links to this page – presumably using XRumer – and these links remain floating round the internet (incidentally: our massively expensive SEO agency didn’t notice this – I did, through a desultory check via ahrefs.com).

I don’t suppose there are many lessons from this except:

  • Don’t rely on an SEO agency to do everything for you
  • Update your WordPress often
  • Despite Google’s claims that it expunges bad content and links from its index, that’s clearly nonsense: three months after the hack, all those XRumer-built links and hacked blog posts, connected to an empty affiliate scheme in the gambling sector, remain in Google’s index.

I don’t imagine there’s much margin any more in doing this sort of thing, but if people are still making something of a living by auto-creating hundreds of thousands of pages at once and hacking WordPress, then I guess my musing the other week about the status of black hat SEO might be out of date in itself.

If you have the energy, you can probably do something with this collection of bits and bats. Sadly, I don’t.

Google’s Eccentric Choice of Review Partners

Google have proselytised about the value of reviews for many a long year (in internet terms, at least). Their own reviews system – like many of the company’s second-tier offerings – has never really garnered traction. Doubtless, this is partly because it is yoked to Google Plus / Google Profiles, but also because lots of other players in the space actually have better systems. As an example of this, one only has to check out the Google ‘reviews’ for Guantanamo Bay or Broadmoor Hospital.

broadmoor

While these might be hilarious, they don’t really indicate that Google is promoting or policing its product properly.

But what should be of more concern to Google’s product managers in this space is the drawing in of review scores from other providers (hint: go look for a new job!).

Typically, the big one box/card for a company will highlight Google Reviews, but also draw from other sources around the web. This mirrors Google’s recent, Hummingbird-led forays into injecting information directly into the results rather than encouraging clicks to source data. This is very much a topic for another day.

In the meantime, however, just look how seriously Google are taking things. A search for “Chiswick Honda” has these results:

chiswick1

Reviews from… webcompanyinfo.com? websitepic.com? Both of these sites are actually just the kind of thin-content SEO shill that offers some crappy “seo data” for webmasters. Clicking the links from that one box doesn’t even take you to reviews! Just this kind of crap.

chiswick2

It isn’t hard for Google to determine who offers real reviews from real people – be it Reevoo, Bazaarvoice, Feefo etc – so why are they giving airtime to operators like this?

The answer lies, as ever, in monetisation and a power-play that Google is engaged in against review sites. As the day draws short, I will leave that for another post, but if you take anything away from this let it be this: Google’s concern for ‘quality’ is often skin deep where that ‘quality’ poses even a minor threat to its own model.

More anon.

Tabs, “Hidden Content” and Google

Tabs are a handy, universally understood visual metaphor that has been used for many years by designers to make manageable and usable pages. There has always been a small degree of confusion about whether or not Google treated tabbed content as ‘hidden’ content and whether or not they would penalise sites using tabs.

Following a post by John Mueller, it seems that Google have come down against tabs. They believe that tabs are a way for people to show one thing to users and another to Google.

To an extent, that’s true: it would be easy to make a short, punchy “selling” article that is seen by visitors, while hiding a whole bunch of keyword-heavy text behind a tab. Whether that’s good or bad practice is something of a religious question.

Personally, I’ve always felt – and still do – that tabs are a good way to visually organise things on a page. Here’s how I use them on my hobby site:

tabs

Now I don’t see anything inherently wrong with this. I can make a super-useful page, packed with content but organised in such a way to be navigable without a 4 mile long page.

But I think Google and I disagree on this. Recent site: searches have revealed that the main ‘hub’ pages for any topic have been downgraded. Searching site:weirdisland.co.uk “yorkshire ripper” did not place the relevant page at the top, despite a reasonably solid internal link structure. Instead, the target page was under pretty much every other page on the topic.

This made me sniff around the page to see what the problem could be. The main suspect? A ‘timeline’ tab. This tab included data from all the related articles – dates and locations, all Schemafied and presented in a nice fashion. I couldn’t see any real fault with that, but looking at it again from what Google have been saying, this tab actually had more information and a higher word count than the main article itself.

tabs2

In my eyes, I had done a pretty nice job of balancing visual presentation and information, but I suspect this sort of thing is the kind of trigger for Google to downgrade a page.

As such, I’ve separated these timelines into standalone pages like this.

I feel ambivalent about this. I feel that I’ve been bullied into changing my site to fulfil an algorithmic diktat from Google that implies that my design was an attempt to trick their bots. Part of me thinks I should stand my ground and not change a thing.

However, as part of the remit I’ve given myself with that site is to use it as a testbed for such things, I’ve caved to see what happens from a Google perspective.

I will, of course, let you know what happens.

Spoofed Referral Traffic in Google Analytics

The continued spoofing of referral traffic in Analytics highlights a couple of things:

  • Shortcomings in one of Google’s flagship products
  • The shift away from old-skool SEO for spammers to more subtle ways of gaining traffic

My hobby site (weirdisland.co.uk. Go visit it now. Please), even with its paltry visitor numbers (just shy of a couple of hundred per day), gets a small but noticeable trickle of traffic from fake sources such as:

  • Semalt.com
  • buttons-for-website.com
  • ilovevitaly.com
  • swagbucks.com
  • priceg.com
  • darodar.com

These are covered in good detail over at Refugeek and by Dave Buesing (both sources have some good tips for removing these sites from appearing in Analytics if you want clean, realistic visitor numbers).

The basic method relies on the fact that Analytics can be spoofed – tricking the unwary site owner into thinking they are getting actual human visitors from these sources. In fact, these are just faked visits by bots posing as browsers and passing through false headers.
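If you would rather clean the numbers up after the fact, a sketch along these lines would strip the known offenders out of exported referral data. The blocklist is the list of domains above; the `(referrer, sessions)` row format and the function names are assumptions for illustration and are not part of any Analytics API.

```python
from urllib.parse import urlparse

# The fake referrers listed above.
SPAM_REFERRERS = {
    "semalt.com", "buttons-for-website.com", "ilovevitaly.com",
    "swagbucks.com", "priceg.com", "darodar.com",
}

def hostname(referrer):
    """Return the lower-cased host of a referrer, with any leading 'www.' removed."""
    # Bare domains have no scheme, so prefix '//' to make urlparse treat them as hosts.
    parsed = urlparse(referrer if "://" in referrer else "//" + referrer)
    host = parsed.hostname or ""
    return host[4:] if host.startswith("www.") else host

def is_spam(referrer, blocklist=SPAM_REFERRERS):
    """True if the referrer is a blocklisted domain or any subdomain of one."""
    host = hostname(referrer)
    return any(host == domain or host.endswith("." + domain) for domain in blocklist)

def clean(rows):
    """Drop rows whose referrer is on the blocklist, keeping the original order."""
    return [(referrer, sessions) for referrer, sessions in rows if not is_spam(referrer)]

# Hypothetical export rows: (referrer, sessions).
rows = [
    ("http://semalt.semalt.com/crawler.php", 14),
    ("https://www.google.co.uk/", 120),
    ("darodar.com", 9),
    ("notsemalt.com", 2),
]
cleaned = clean(rows)
```

Matching on the whole hostname rather than a substring means a legitimate site that merely contains one of these names is left alone.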

The motivation seems to be (as far as I can tell) to get site owners to visit these sites to see where their link is. Personal example: I started getting traffic from Semalt.com and visited their site to see where/how/why they were linking to me. I couldn’t find anything, but noticed that they had some on-the-face-of-things useful SEO tools. I signed up for a ‘free account’ and then promptly forgot all about them, but they still send me emails asking if I want to upgrade to their pro package.

It’s a cunning sleight of hand when you look at it this way. In an easily scalable way, they can effectively drive reasonable levels of traffic to their site by bringing themselves to the attention of anyone with Google Analytics installed. Once those people are on semalt.com, the bait and switch takes place, and a certain number of people will thus sign up to their product. I imagine it’s probably profitable.

That’s obviously deceitful practice, but highlights how the nature of scamming has changed. As Google has made it harder and harder to spam the SERPs, so innovators/black hats (delete as per your prejudice) are looking for new routes.

huffpo

A current fake referrer to my site disguises itself as Huffington Post. At first, I was briefly excited – perhaps I’d got a link from HuffPo! In fact, the referral itself was spoofed: the Huffington Post link – when clicked in Analytics – actually redirected to some Chinese shopping site, presumably dropping some affiliate cookies along the way to capture revenue from me should I ever do any shopping on Aliexpress.com (which is where the link actually redirected).

Update: on closer inspection, I’ve noticed that the URL is actually “hulfingtonpost.com”, which also explains how the redirect works.
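A one-letter swap like that is easy to miss by eye but easy to catch in code. The sketch below flags referrers that sit within a couple of character edits of a site you would actually be pleased to hear from. The watch-list, the threshold of 2 and the function names are all assumptions; the distance measure is the standard Levenshtein edit distance.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b)  # substitute, free if equal
            ))
        previous = current
    return previous[-1]

KNOWN_SITES = ["huffingtonpost.com", "bbc.co.uk"]  # hypothetical watch-list

def lookalike_of(domain, known=KNOWN_SITES, max_distance=2):
    """Return the known site this domain imitates, or None if exact or unrelated."""
    for site in known:
        distance = edit_distance(domain, site)
        if 0 < distance <= max_distance:
            return site
    return None

suspect = lookalike_of("hulfingtonpost.com")
genuine = lookalike_of("huffingtonpost.com")
```

The fake domain is exactly one substitution away from the real one, so it is flagged, while the genuine domain (distance zero) and unrelated domains are not.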

It’s cunning stuff, to be sure, but I find it hard to believe that it’s a sustainable or large enough niche for anyone to make more than a few quid from. As I mentioned a couple of posts ago, it adds to my belief that black hat/affiliate sites are finally being shuttered by Google and the glory days of such operations are now behind us.

As such, we should actually tip a hat to Google in thanks. For many years, spammers and scammers tried – and succeeded – in keeping the SERPs cluttered with affiliate links dressed as content. Google announced their intention to do away with this years ago and now – if you want to go down that route – you have to go big on site quality and content. Of course, the high price of doing that makes most affiliate programs unsustainable because building the necessary traffic levels can’t simply be left to content spinning and XRumer any more.

Good.

Keyword Data back in Analytics

One of trad SEO’s biggest gripes for the last couple of years has been the obscuring of keyword data in Analytics. Of course, much of that data has actually been available in Webmaster Tools for quite a while now.

gwt

Until today (so far as I’ve seen – it’s probably been rolled out all over the place in stages) the nearest equivalent data in Analytics was found under the Acquisition > Keywords > Organic screen.

But now? That’s gone, and the data from GWT is showing up in Analytics.

analytics

This is a nice move, as it puts back a little context into the job rather than educated guesswork based on landing page URLs. It still means a bit of legwork if you want to do detailed analysis but for most SEO purposes it is a long-overdue move. The only critical issue with this is that, assuming it follows the pattern used in Webmaster Tools, the data will only be available for the last 90 days and won’t include the last 2 days – which will obviously cause some limitations in analysis.
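To see what that limitation means for a given report, here is a small sketch that works out the usable date window. The 90-day history and 2-day lag are the figures from the paragraph above and are themselves an assumption about how the data behaves. The exact boundary handling (window runs from today minus 90 to today minus 2, inclusive) and the function names are also guesses.

```python
from datetime import date, timedelta

def available_window(today, history_days=90, lag_days=2):
    """Return (first_day, last_day) of keyword data assumed to be available on `today`."""
    first_day = today - timedelta(days=history_days)
    last_day = today - timedelta(days=lag_days)
    return first_day, last_day

def can_analyse(start, end, today):
    """True if the whole requested report range falls inside the available window."""
    first_day, last_day = available_window(today)
    return first_day <= start <= end <= last_day

today = date(2015, 6, 30)  # arbitrary example date
first_day, last_day = available_window(today)
quarter_ok = can_analyse(date(2015, 4, 1), date(2015, 6, 28), today)
yesterday_ok = can_analyse(date(2015, 6, 29), date(2015, 6, 29), today)
```

Anything like a year-on-year comparison falls outside the window, so if you need one you would have to export the data regularly and keep your own archive.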