Tuesday, 13 October 2015

What do Goodreads ratings say about sales?

I have long maintained from my own data that there is a pretty close relationship between the number of Goodreads ratings a book has and its total sales, PROVIDING that the books are of a similar age and the same genre.

Others have disputed me. So I've been out to gather data.

Many authors don't like to talk about sales figures, just as many don't like to talk about money. That's a position I respect and understand. I don't share it.

The scatter-plot below shows 'number of Goodreads ratings' along the bottom and 'sales in English' along the side.




Red circle = 2011 book. Pink square = 2012. Blue diamond = 2013. Green triangle = 2014. Asterisk = 2015.

Black circles are anonymous data from 2011 onwards.
Black squares are anonymous data from before 2011.


They are all for fantasy books.

The ratings to sales ratio is pretty stable. If you multiply the number of Goodreads ratings by 7.7 you come pretty close to the number of books sold in English (all formats).
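For anyone who wants to try the arithmetic, here is a minimal sketch in Python. The 7.7 multiplier is the figure from this post. The function name `estimate_sales` and the `min_ratings` cut-off of 300 are my own assumptions for illustration (the post only says "several hundred" ratings), so treat the result as a rough guess for traditionally published fantasy of a similar age, not a measurement.

```python
def estimate_sales(ratings: int, multiplier: float = 7.7, min_ratings: int = 300) -> float:
    """Rough estimate of English-language sales (all formats) from a Goodreads ratings count.

    multiplier:  7.7 is the ratio reported in the post for recent fantasy titles.
    min_ratings: hypothetical cut-off standing in for "several hundred";
                 below it the ratio is too volatile to be useful.
    """
    if ratings < 0:
        raise ValueError("ratings cannot be negative")
    if ratings < min_ratings:
        raise ValueError(
            f"only {ratings} ratings: too few for the ratio to be reliable"
        )
    return ratings * multiplier


if __name__ == "__main__":
    # A book with 2,000 ratings comes out at roughly 15,400 copies.
    print(round(estimate_sales(2000)))
```

Pass a different `multiplier` if your own genre or period suggests one; nothing here is specific to 7.7 beyond the default.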

About half of the authors I invited to supply their data decided not to, including all the ones who sell more than I do (though Peter V Brett has threatened to send me some data points when he returns from his world tour). So this graph may change for the higher numbers. Perhaps if a book breaks out of the genre sales bin it finds a demographic less (or more) likely to rate on Goodreads.

One noticeable outlier (blue diamond in the lower section) has the possible explanation (suggested by the author) that it sold a lot of very cheap e-copies that may be sitting on Kindles waiting to be read. Books on special offer are sometimes snapped up and saved for later. Also, most of his sales were in the UK. Perhaps the British use Goodreads less.


So there you have it. Some evidence that when the book is traditionally published and has a decent number of ratings (say several hundred, to make the effects of any manipulation insignificant) you appear to be able to guesstimate sales pretty well!

This will of course vary from genre to genre (with the likelihood of the demographic to rate on Goodreads) and with time: books published before Goodreads, or when Goodreads was smaller, will be under-represented by ratings.

Although the ratio will change with genre and age, it seems very likely that if you take two books from the same genre and the same period, and one has twice as many ratings as the other, it will also have twice as many sales, and that prediction will be pretty accurate.

The annotated points in easier-to-read form.

14 comments:

  1. Very interesting. It would be nice to have even more books to verify this, but as a reader who has often wondered about sales, this is a great way to estimate the sales of my favorite books -- specifically English sales.

  2. Thanks for sharing this Mark. Interesting stuff -- I've often wondered about this very question. The multiplier for my books seemed more like 6 than 7.7, but admittedly, I was just eyeballing...

  3. Thanks for sharing this Mark. Interesting stuff -- I've often wondered about this very question. The multiplier for my books seemed more like 6 than 7.7, but admittedly, I was just eyeballing...

  4. Interesting article. I guess it gets a bit chicken and egg, but it would be interesting to see if Goodreads ratings correspond with sales too, e.g. if the rating goes up, is there an upswing in sales, and vice versa?

    Curious to see the possible effect of an ebook promo. I know I have stacks of ebooks that may potentially never be read but the impulse to buy a cheap book that doesn't take up physical space is always hard to resist.

  5. I looked at this and thought, 'it probably doesn't work for newly released books with a lower number of ratings.'

    Just got a look at an early Royalty Statement (drastically different than a Royalty Cheque) and it's spot on.

    Damn.

  6. I'd really like to see how this relates across audiences and what the difference is between genre and literary fiction, or maybe more specifically, I have a gut feel that books for smaller, more 'specialised' audiences might be more affected by this and the big middle of the road mainstream best sellers might be less affected by this.
    Also would be quite interesting to see how mainstream advertising affects this.

  7. This doesn't work for me, Mark. Using books (all fantasy epic) published between 2011 and 2015, the ratio was between x11 and x22...and that was just for US sales, not English language worldwide. I have often wondered if being a woman writer affects ratings and reviews.

    Replies
    1. Interesting. Out of more than a dozen authors you're the first to report a significant difference. Even the outlier reported in the blog turned out to have over-estimated his sales.

      It would be interesting to have more points and to check any male / female difference (Kameron Hurley provided data that fit the trend and she's female). When dealing with smaller numbers (and your post 2011 books have hundreds of ratings rather than thousands) we can expect more volatile behaviour, though I expected any bias on top of that volatility to be in the opposite direction to the one you indicate.

  8. Nice work. The ratio is way lower than I expected. I would expect that far less than 10% of readers even know that Goodreads exists. And of those, only a fraction will bother rating any given book. Comparing with computer games, the ratio is closer to 100 for Steam reviews, which most players know about and use to actually buy the game. I've also seen the ratio of copies sold to Amazon book reviews estimated at around 100. It would be interesting to see a deeper analysis of why these are an order of magnitude different.

  9. Doesn't work for me either, Mark – maybe because most of my sales are in the UK and Australia. For my first book (1998) the ratio is 55. For my 2011 book it's 30. And for last year's (2016) book it's 122. They're all epic fantasy.

    Replies
    1. Well, 1998 is clearly outside the bounds of the study and significantly predates Goodreads itself (2006). So no surprise there.

      And your 2016 book has fewer than 100 ratings, and as noted the statistics are volatile for small numbers. The article says several hundred is the minimum point at which one should consider using the ratio.

      For the only book to which I would advise applying the method you have a 30. I can't explain that, though as a scientist it _greatly_ surprises me and I would love to have access to a larger body of ground truth data.

      Your readers presumably read other fantasy and are drawn without bias from the general population of fantasy readers. Why then would they be statistically far less likely to register their opinion on Goodreads for your books than their fellows (& quite possibly they themselves) are for the books of other authors? ... a mystery.

      It would be nice to get raw data from a range of publishers over a much larger number of books. But that's never going to happen.

      One important point is that for the data in my post I approached authors. It was a fairly unbiased sample. For the data volunteered in the comments it is authors approaching me ... and who is most likely to take the effort to comment? Well, one group is authors for whom the formula doesn't seem to work. They are far more likely to want to comment than authors for whom it does work and who nod and move on (I have had a body of feedback to this effect in face to face conversations, on forums, etc). So what we see in these comments is a self-selected collection of outliers. Which can't of course be sensibly included in the data, but which might motivate the collection of more (non self-selected) data.

    2. An additional thought... It might be that you have a body of loyal readers who formed an attachment to you in the late 90s and as an older generation are statistically less likely to be internet/Goodreads users.

      The authors in my study are (as far as I know) all first published in the last ten years and so have first recruited their readers in the Goodreads age / a period of far greater internet use.

  10. Thanks Mark. Your second thought is bang on. My 11 Three Worlds epic fantasy novels were published between 1998 and 2008 and sold very well in Australia and the UK, where most of my fans still are. From talking to some other fantasy writers, the 7.7 factor doesn't seem to work so well for AU and UK sales. Interesting. But anyway, great article! Cheers, Ian.

    Replies
    1. I sell as many books in the UK as in the US, and do pretty well in Australia, so I think it's likely to be more about when you got most of your readership rather than where you sell.
