Insights, analysis, and research about emerging technologies State of the Computer Book Market 2010 Copyright 2011, O’Reilly Media, Inc.

The State of The Computer Book Market 2010


DESCRIPTION

O'Reilly's analysis of the American market for computer books, a market that also points the way to the technologies that currently matter.



Post 1

In the two years since the last State of the Computer Book Market posts, the tech book market has gone through some major changes. Hopefully you will see some of the trends driving those changes through the faint signals that the book market provides.

You can get a quick refresher on how we see Computer Book Sales as a Technology Trend Indicator and our other posts on the State of the Computer Book Market.

The data is from Bookscan's weekly top 3,000 titles sold. Bookscan measures actual cash register sales in bookstores. Simply put, whenever you buy a technology-oriented book in the United States, there's a high probability it will get recorded in this data. Retailers such as Borders, Barnes & Noble, and Amazon make up the lion's share of these sales.

There will be five posts in total, which will be delivered every other day in the next week:

1. Post 1, Overall Market
2. Post 2, Category Performance
3. Post 3, Publisher/Imprint Performance
4. Post 4, Programming Language Performance
5. Post 5, Summary and Digital Sales

Overall Book Market Performance

Before we get to the specifics of the computer book market, let's get some context by looking at the whole book market for the week ending January 2, 2011. Everything that is printed, bound, and sold as a book, from The Girl with the Dragon Tattoo and Eat, Pray, Love to Decision Points and The Ugly Truth, is represented in the table below.

Overall Book Market - EVERYTHING - Week Ending: 2011-01-02

All Books, All Subjects

Juvenile Non-Fiction -0.44%

Juvenile Fiction -3.46%

Total Juvenile -2.88%

Adult Non-Fiction -1.91%

Computers and Internet -3.99%

Adult Fiction -7.20%

Other -13.12%

Total Market -4.54%

As you can see, the computer market is down about 4% from last year. It should be noted that the computer book market makes up only about 1% of total unit sales in bookstores and online retailers. If you would like to see the performance of the major categories, this table shows percentage growth. I find it interesting that the Humor category is one of the largest-growing in an otherwise depressed market. The other growth area is Children's Non-Fiction Education/Reference -- and I certainly wonder why this category in particular is experiencing such strong growth.

Now on to the tech book market. The chart below gives some perspective into how each year stacks up against prior years. As you can see, there has been some serious erosion since 2007, our most recent high point. The sales in 2007 had many believing that the market was finally recovering from the post-2001 decline, but then 2009 showed the biggest drop from a prior year.

Immediately below is the weekly trend for the entire computer book market since 2004, when we first obtained reliable data from Bookscan. Please remember that the data represents all publishers, and not just O'Reilly. The slightly thicker red line represents the 2010 data.


As you can see, the clear seasonal pattern we've pointed out before still exists. That is, we have a strong start that declines through the summer, spikes for the fall "Back to School" season, and finishes strong. The trend line for each year closely mirrors the year before, with remarkably consistent weekly ups and downs. One trend that provides a bit of a silver lining in a fairly poor 24 months is that 2009 averaged about 10-15% below the prior year on a weekly basis, whereas in 2010 the market was only about 3-6% below the prior year on a weekly basis. Could this indicate that the market has seen bottom, or are we just in a holding pattern as purchasers figure out how they want to acquire tech content? There are more selling options available now, as you will see in a later post, but the long and short of it is that many publishers are achieving more revenue and units through sales of the same content in a variety of digital formats. For some, the decline of print distribution is being offset by digital distribution and sales.

What you won't see on this chart is that the computer book market cratered in 2001, shrinking 20 percent a year for 3 years, until it stabilized in 2004 at about half the size it was in 2000. (We only have reliable data going back to 2004.) You can now see a second cratering in the market that started in the second half of 2008 and has continued through 2010. The overall market growth rates for the previous six years are: 2005 = 1.48%; 2006 = 3.17%; 2007 = -2.00%; 2008 = -4.27%; 2009 = -15.31%; 2010 = -4.29%
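The "about half the size" figure follows from compounding three 20 percent declines, and the same arithmetic can be applied to the yearly growth rates listed above. Here is a minimal Python sketch of that calculation; the function name `compound` is made up for illustration and is not part of any Bookscan tooling.

```python
def compound(rates_pct):
    """Return the cumulative size multiplier after applying each yearly
    percentage change in turn (for example, -20 means a 20% decline)."""
    size = 1.0
    for rate in rates_pct:
        size *= 1 + rate / 100
    return size


# Three straight years of -20%: 0.8 ** 3, or roughly half the starting size.
after_crash = compound([-20, -20, -20])

# Yearly growth rates quoted in the post for 2005 through 2010.
since_2004 = compound([1.48, 3.17, -2.00, -4.27, -15.31, -4.29])

print(f"After three 20% declines: {after_crash:.3f} of the starting market")
print(f"End of 2010: {since_2004:.3f} of the 2004 market")
```

Taken at face value, the quoted rates put 2010 unit sales at roughly 80% of the 2004 level.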

So what about the market was news in 2010? In 2010, there were 11 weeks that were ahead of the prior year's unit sales. In 2009, there were only two weeks that were ahead of the prior year. So from that perspective, we have seen some signs of a recovery. 2010's overall growth rate (-4.29%) finished close to 2008's (-4.27%), and the market declined much more slowly than in 2009. To the optimists in the crowd, it appears as though we have seen bottom -- but pessimists will believe that they've seen this before: the market looks as though it has hit bottom, but then takes another big hit downward. So it is really too unstable to predict whether it will move down again or continue to recover.

Another way to look at the market is with the Treemap visualization tool. This tool helps us pick up on trends quickly, even when looking at thousands of books. It works like this:

The size of a square shows the market share and relative size of a category, while the color shows the rate of change in sales. Red is down, and green is up, with the intensity of the color representing the magnitude of the change. The following screenshot of our treemap shows gains and losses by category, comparing the fourth quarter of 2010 with the fourth quarter of 2009.

So what are all the boxes and colors telling us? First, remember that this compares the last quarter of 2010 with the last quarter of 2009. This snapshot of the treemap looks less like the blood-bath of prior years (when there was red everywhere). There were quite a few bright spots (bright green) during the last quarter of 2010. Take a look at Android (in the upper-left box), with a 2,413% growth from the fourth quarter of 2009. You will also see Android in bright green (bottom-left corner) -- the difference is that the upper-left is for consumer books and the bottom-left is for Android programming books. Both had impressive growth in 2010 compared to 2009. In the upper-left corner is iPad, which is black because there were no iPad books the prior year. However, it is impressive how big the box is at this early point in its evolution.
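The color rule described above can be sketched in a few lines of Python. This only illustrates the idea and is not the code behind the treemap tool: the function name `roc_to_rgb`, the linear scaling, and the choice to reach full intensity at a 100% change are all assumptions. Black for a category with no prior-period sales follows the iPad example.

```python
def roc_to_rgb(roc, saturate_at=1.0):
    """Map a rate of change (0.25 means +25%) to an (R, G, B) tuple.

    Green means growth, red means decline, and a brighter color means a
    larger change. None stands for "no prior-period sales" and maps to black.
    saturate_at is an assumed cap: changes at or beyond it get full intensity.
    """
    if roc is None:
        return (0, 0, 0)
    strength = min(abs(roc) / saturate_at, 1.0)
    level = int(strength * 255)
    return (0, level, 0) if roc > 0 else (level, 0, 0)


print(roc_to_rgb(24.13))    # +2,413%, as quoted for Android consumer books
print(roc_to_rgb(-0.3212))  # a decline of about 32%
print(roc_to_rgb(None))     # no prior-year sales, as with iPad
```

The size of each box would come separately from the category's market share; only the color depends on the rate of change.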

In 2010, Windows 7 was the number one growth area for units, followed by iPad, then Android (for consumers), and Android programming. This is unit growth, and a bit of the success for these technologies is that they are fairly new and do not have large market shares as a base to be measured against. Looking at longer-established technologies, Security and Network Security and Digital Photography had strong unit growth.

I find it useful to organize the trends into four classifications: High Growth Categories (bright green), Moderate Growth Categories (dark green to black), Categories to Watch (all colors), and Down Categories (red to bright red). Most of these descriptions are self-explanatory, except perhaps Categories to Watch. This group contains titles that we've found are not typically susceptible to seasonal swings, as well as areas on our editorial radar. If there are categories you want to get on our watch list, please let me know.

The table below highlights and explains some of the data from the chart above, although the data is for all of 2010. The Share column shows the total market share of that category, and the ROC column shows the Rate of Change (RoC = (current_period - prev_period) / prev_period). So, for example, you can see that Mac OS books represent 2.95% of the entire computer book market, and were shrinking by 32.12% (RoC).
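The RoC formula above translates directly into code. The sketch below uses made-up unit counts purely for illustration (they are not Bookscan figures), and the function names are hypothetical. Returning `None` when the prior period has no sales is an assumption, chosen because a category with no earlier presence has no meaningful rate of change.

```python
def rate_of_change(current_period, prev_period):
    """RoC = (current_period - prev_period) / prev_period.

    Returns None when there were no prior-period sales, since the ratio
    is undefined in that case (an assumed convention for this sketch).
    """
    if prev_period == 0:
        return None
    return (current_period - prev_period) / prev_period


def market_share(category_units, total_units):
    """Share of the whole market held by one category."""
    return category_units / total_units


# Hypothetical numbers: a category that sold 1,000 units last year and
# 750 this year, in a market of 25,000 units.
roc = rate_of_change(750, 1000)
share = market_share(750, 25_000)

print(f"Share: {share:.2%}, RoC: {roc:.2%}")  # Share: 3.00%, RoC: -25.00%
```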

High Growth Share ROC Notes

Windows 7 05.53% 217.38% A large category that has finally taken over where XP and Vista titles left off. Windows 7 is a solid leader for operating systems.

iPad 01.74% xx.yy% This category had no 2009 presence, but is now the 7th largest category in market share.

Android Programming 00.68% 292.65% This category has grown steadily since the end of 2008, and is now the 42nd largest book category, and has the 3rd fastest RoC.

Usability 00.11% 491.50% This is not a huge area now, but its high RoC is accelerating its growth into a sizeable category.

Android 00.55% 2493.88% This is the consumer area for Android -- user guides, best apps, etc. Jumped from nothing in 2008, to solid growth in 2009, to top of the charts in 2010.

Moderate Growth Share ROC Notes

jQuery 00.41% 83.79% A good-sized category, where title output decreased from 9 titles in 2009 to 6 in 2010, yet unit sales grew steadily.

Cloud Computing 00.22% 63.58% A growing category that saw 8 new titles make the 2010 results. Titles are mostly introductory at this point in time.

Windows Administration 00.32% 36.29% A medium size category with 19 new titles in 2010, compared with 16 new titles in 2009 (the 2009 titles add to the units in 2010).

Social Web 00.21% 30.27% The titles in this category have doubled in the past two years, and 2010 had the biggest growth in number of titles, at 61%.

Network Security 00.90% 24.25% Security topics in 2010 have done well, with consistent output in the number of new titles.

Categories to Watch Share ROC Notes

Office Suites 2.71% -3.54% A very large category and usually consistent. Even though this category is down, it is not as down as the whole market.

Digital Photography 05.97% -17.20% A very large category (2nd after Windows), with 4 titles selling more than 10,000 units; 10 additional new titles in 2010 produced 64,581 fewer units.

Spreadsheets 02.99% -4.66% The third largest category, with 4 titles selling more than 10,000 units; 11 additional titles making the list in 2010 produced 8,754 fewer units than 2009.

Software Project Management 02.10% -15.14% A good-size and consistent category, though it is down in units sold. PMP and Agile PM seem to be the most popular titles here.

Down Categories Share ROC Notes

Flash 01.12% -84.43% Got a bruising from Apple, and this category went into the tank, losing 59,340 units in 2010 compared to 2009; in 2009, it lost 55,187 units compared to 2008.

Mac OS 02.95% -32.12% A large category with 59,668 fewer units sold in 2010. 67,642 fewer units were sold in 2009 than in 2008. Apparently Snow Leopard was a book-sales bust.

Web Design Tools 01.37% -53.20% This category took a beating, mostly because Dreamweaver CS5 titles did not pick up sales as quickly as the CS4 titles slowed down. 45,709 fewer units were sold in 2010.

Web Programming 02.01% -41.32% A good-size category with 16 fewer titles contributing to the category in 2010. In 2009, this category saw 56 titles selling more than 1,000 units, compared to 37 titles in 2010.

Web Page Creation 04.09% -27.37% A large category with only 4 titles producing more than 10,000 units in 2010 whereas 7 titles in 2009 hit that mark. There were 70,492 fewer units sold in 2010.

Post 2 in this series will provide a closer look at the technologies within the categories. Post 3 will be about the publishers, both winners and losers. Post 4 will contain more analysis of programming languages, and Post 5 will look at digital sales.

Post 2

In this second installment (the first post can be found here), we look at computer book sales in specific technology categories. Remember that we've organized the data into six "Category Families": Systems and Programming, Web Design and Development, Business Applications, Digital Media Applications, Consumer Operating Systems and Devices, and Computer Topics. Within each of these Families are category group, super-category, category, and atomic category, in a five-level hierarchy. For example, Systems and Programming includes the category groups programming languages, databases, software engineering, general programming, security, and so on. In the rest of this post, we will contrast the final quarter of 2010 with the final quarter of 2009, as well as the whole year of 2010 with 2009.

As a refresher, here is a new treemap of the Category Families, with their sub-areas for the final quarters of 2010 compared to 2009.


This treemap shows a mix of red, green, and black, which basically reflects the fluctuating market. There is very little bright green (which represents fast growth). But again, remember that this is comparing the last quarter of 2010 with the last quarter of 2009. Two of the biggest and brightest green areas are Android Programming and Android Consumer, both of which grew from tiny specks of boxes in 2008 to fairly sizeable areas in 2010.

In the next two images, you can see how our Category Families stack up. The image on the left shows the number of titles that made the top 3000 in a given year. Contrast that with the image on the right, which shows the number of units sold in each year. What you will notice is that the number of titles in Business Applications/Topics and Systems and Programming went up in 2010, yet the units sold for both Categories went down. Consumer Operating Systems and Devices was the only area that went slightly up in both the number of titles and units sold in 2010. Systems and Programming is the largest category, but its performance is more volatile, and is experiencing the largest overall decline. This category is the chief indicator for the health of the computer book market, and it's in consistent decline, at least for print books. You'll see some more positive indicators in my upcoming post on digital distribution.

Charts: Titles (left), Units (right)

The table below shows each Category Family's compared growth between 2009 and 2010 (YoY Growth), 2009 and 2010 ranking (09Rank/10Rank) and 2009 and 2010 percent of market share (09Share/10Share).

Category Families YoY Growth 09Rank 10Rank 09Share 10Share

Business Applications -05.10% 2nd 2nd 20.60% 21.00%

Computer Topics / Other 04.09% 6th 6th 02.82% 03.15%

Consumer Operating Systems 04.22% 4th 3rd 15.44% 17.27%

Digital Media -18.32% 5th 5th 10.66% 09.65%

Systems and Programming -03.32% 1st 1st 33.39% 34.62%

Web Design and Development -28.01% 3rd 4th 17.10% 14.32%

Before we look into categories further, let's first take a look at the words that make up all the computer titles for 2010. It's an interesting view of the words that the publishing industry puts on the front of books, online searches, and anywhere there is metadata about content. A note about this data: I threw away the stop-words like "the," "and," "it," "with," etc. I also disregarded "Microsoft," since it is a descriptor used for various products and is redundant. Here is the 'title' view of the market.


When we drill into the category families a bit, we see that seven of our ten top categories (known as super-categories) sold fewer units in 2010 than in 2009, for a net loss of 244,936 units for just the top ten areas. In other words, our bigger and typically more stable areas were selling significantly fewer units in 2010. In the first half of 2010, there were 49 super-categories that were ahead in sales over the first half of 2009, yet six of the 49 slowed down and ended up losing enough ground to show a year-over-year decrease in units. We ended up with 43 super-categories producing more units in 2010 than they did in 2009. The biggest winners, in growth order, are: Tablet, Mobile Programming, Windows Consumer, Security Topics, Hardware Topics, Social Web, Computers and Society, Cloud Computing, Information Technology, and Data Topics. The Tablet super-category went from roughly 15,000 units in the first half of 2010 to an additional 100,000 units in the second half of the year. An increase in titles fueled this growth: output tripled from 7 titles in the first half of 2010 to 22 titles by the year's end. The areas with the largest drop in units were, in descending order: Web Page Creation, Digital Photography, Mac OS, Flash, Web Programming, Web Design Tools, Personal Computers, Linux, Software Project Management, and Personal Database. The category that surprises me the most is Web Programming. Sixteen fewer titles in the Web Programming area made the list in 2010, and only 7% of the titles sold more than 1,000 units, as compared to 11% in 2009.

As the market keeps declining, the response of many publishers is to increase the number of titles published, in an attempt to gain market share. Immediately below are two bar graphs showing how many titles made it into the Bookscan dataset in a given year, and the average units sold across all titles. The non-obvious point here is that there are not necessarily more titles being published, but more titles making it into the data set. This could be attributed to a lower threshold to get in: some weeks, the threshold to make the top 3000 list can be as low as 6 units sold. It is a relative measure. The last couple of years have had lower thresholds, and thus more titles made the list but with worse average units. When the market is healthy, the threshold moves up and only the solid-performing titles make it into the top 3000. The lower threshold is resulting in a significant decrease in the average units per title for all publishers. Out of the 22 largest imprints, 18 increased the number of titles that made the list in 2010. Yet only 6 of these 18 imprints with title increases also saw increases in their average units per title. The point, again, is that the market can see more titles making the top 3000 list, but if the threshold is lower, the average units and overall units will be too.
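To make the threshold effect concrete, here is a small Python sketch. It treats the weekly list as "the top N titles by units sold" and reports the entry threshold (the units of the last title to make the list) along with the average units per listed title. The weekly figures are invented for illustration, and a list size of 5 stands in for the real 3,000.

```python
def top_n_stats(weekly_units, n=3000):
    """Return (threshold, average_units) for the top-n titles in one week.

    threshold is the unit count of the lowest-ranked title that still
    makes the list; average_units is the mean over the listed titles.
    """
    listed = sorted(weekly_units, reverse=True)[:n]
    if not listed:
        raise ValueError("no titles sold this week")
    return listed[-1], sum(listed) / len(listed)


# Hypothetical weekly unit sales per title in a healthy and a weak market.
healthy_week = [900, 400, 120, 60, 30, 12, 9]
weak_week = [700, 250, 80, 20, 8, 6, 4]

print(top_n_stats(healthy_week, n=5))  # (30, 302.0)
print(top_n_stats(weak_week, n=5))     # (8, 211.6)
```

The list has the same number of slots in both weeks, but in the weak week a title needs far fewer units to get in, and the average per listed title falls with it.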

Charts: Number of Titles; Average Units

The table below provides a view of the market's erosion. The Average Min value represents the "low threshold" weekly average during a given year. The Average Max is the high-range weekly average for a given year. Number of Titles is self-explanatory. You will notice that the years with the highest min had fewer overall titles represented in the data. The bottom line is that as the market erodes, it appears as though we are seeing a watering-down: more titles producing poor results.

Year Average Min Average Max Number of Titles

2004 9.2 1,133 7,451

2005 9.6 1,099 7,123

2006 9.6 1,315 6,881

2007 9.4 1,348 7,092

2008 8.2 1,534 7,310

2009 7.3 1,057 7,557

2010 6.7 1,112 7,792

So it could be said that we've been in a bit of a tech innovation slump. Will any technology, platform, method, theory, or new-fangled invention stave off this market slump? Or will we continue the treadmill effect of more publishers chasing lost revenue with more titles, which merely replace existing units with marginal decreases? I think it is the latter. Something big needs to come along to drive a large increase in the market. I'm not convinced it is cloud computing, mobile, or social platforms even though those areas seem poised for future growth. What do you think will be the big growth areas in the next five years? Is there anything poised to make a big impact on the tech world?

Now let's look at the categories that comprise each category family. Below are some individual trend charts from our dashboard showing the 24-month period from January 2009 to December 31, 2010 for the major categories. By looking at a 24-month pattern, you get more insight into whether or not a particular area seems to be hit by seasonal factors, and whether there is a steady decline or increase for the category. It is important to look at the scale on these charts because it visually shows you the relative market size. Another way to think about it: if the trend line is high in the individual box, the category is big, and if it is low, it is a smaller category. What is interesting to note is that Consumer Operating Systems and Devices, Digital Media, and Business Applications all have a January spike, which is likely due to individuals buying "how to" books for their new computers, devices, and operating systems. This is a consistent seasonal pattern.

Charts: Computer Topics; Digital Media; Web Development and Design; Consumer Operating Systems/Devices; Business Applications; Systems and Programming

The Categories (24-month rolling, January 2009 to December 2010)

When viewing the charts below, keep the reference charts above in mind. Viewing these jointly provides more context on the size of market and seasonal patterns.

Category_Family: Consumer Operating Systems and Devices

Here are the trend lines for the four main categories (cat_family) that make up Consumer Operating Systems and Devices.

This category is a medium-sized area and was one of two Category Families to show growth year-over-year. This category's growth is driven by Windows 7 and Port Dev (Portable devices). Port Dev was dominated by Android in 2010 and iPhone in 2009. And remember from earlier that the Tablet area moved quickly up the charts in the second half of 2010. We see a new title and topic leading the way this year with Windows 7 For Dummies by Andy Rathbone. Two other Windows 7 titles are in second and third place. The perennial leader, Mac OS X Snow Leopard: The Missing Manual, fell to fourth, as Snow Leopard was not a huge OS release by Apple and the topic did not drive this category as it had in previous years.

This market has shown growth because of the explosive growth of Mac OS X, but if you compare with Windows books, the Windows books are the steady sellers and have the growth in the last two years. The chart below shows how these two are stacked up against each other. Are you a PC or a Mac? The chart below says more of you are PCs!


Category_Family: Business/Office Applications

When comparing the Business Apps area for 2009 and 2010, there were 8 super_cats (one level below cat_family) that performed ahead of the prior year and 23 that underperformed compared to the prior year. Unfortunately the 23 underperforming super_cats lost 67,000 more units than the 8 positive areas had gained, for an overall -5.10% growth rate.

The two healthiest super categories were Spreadsheets (Excel) at +2.42% growth, and Social Network (Facebook) at +11.49% growth, while Graphics Applications at -16.80% and Ecommerce at -47.00% were the two biggest laggards. What surprised me the most was that the Content Management Systems category did not grow, as I had thought it would. So I dug a bit, and discovered that most of the growth in CMS as a category occurred between 2006 and 2009. During the past two years, the category has held its own and performed better than the overall market decline. For a view of CMS growth, click on the chart below.


Here are the trend lines for the three main categories that make up Business/Office Applications.


Notice how much bigger a category "office" is than the other two ("gen bus app" and "design"). But the news in this category is that Office titles have taken a slight downturn, having gone from 196,722 units in 2009 to 187,968 units in 2010, a -4.66% growth. This growth/decline mirrors the overall market. The category has been dominated by dummies... Dummies books, that is. In 2010, the top three titles were Dummies and seven of the top ten were Dummies. This does make sense when you think about it. Learning to use a tool like Excel is not rocket science, and the Dummies books appeal to a broad group of people, ranging from the technically literate to technophobes. In this area, it seems like Dummies have a bit of a book dynasty, so to speak.

Category_Family: Web Design and Development

Web Design and Development is down 28.01% from 2009 to 2010. More than 251,000 fewer units were sold in this category in 2010 than in 2009. And remember, 2009 was the worst year we've seen in a while. There were only two sub areas that showed growth in this category: JavaScript and the Social Web. JavaScript showed a healthy 7.81% growth and the Social Web grew by 7.18%. Our Learning PHP, MySQL, and JavaScript led the category in unit sales. The area that surprised me the most, though, was Web Page Creation, which saw ~70,000 fewer units sold in 2010 than in 2009 (and again, 2009 was a stinker of a year). Are people moving on from HTML and its like to PHP, JavaScript and CMSs? Or are more people interested in making mobile apps that access their web pages?

Here are the trend lines for the three main categories that make up Web Design and Development.

Obviously the big sub category here is "web site". It is dominated by titles that talk about performance, scalability, reliability, and tuning, like what you can find at our Velocity Conference or in this bundle of references. Rich Web Interface moved to second among these categories, but is experiencing declines. In the RWI space, both Flash and Silverlight had fairly significant declines. Flash declined 84.43% while Silverlight declined 8.29%. But the Flash subcategory is currently about four times as large as Silverlight. Could it be that HTML5 makes these two technologies seem kind of moot?

Category_Family: Systems and Programming

This is the largest of our top-level category families. It is the place where most of the programming language, database, and software development titles reside. The normal trend here is for the category to get off to a good start early in the year, and then have another peak around September (when college students go back to school). There are now 67 super_cat subcategories in this area. In 2010, 44 of the areas were negative year-over-year and only 23 areas had growth; when you net the negative and the positive areas, there were 72,024 fewer units sold in these areas during 2010. This is only a 3.32% decline, so this large family of titles actually did better than the overall market. The top five performing categories, in order, were Mobile Programming, Security Topics, Cloud Computing, Information Technology, and Data Topics. The categories with the worst performance, in order, were Linux, Software Project Management, Personal Database, Visual Basic, and SQL Server. In the top performing area of Mobile Programming, iPhone Programming led the way for growth in 2009, while Android led in 2010. Remember this is not the consumer market of books about how to use an iPhone or Droid, but the programming market: iOS was nine times as large as Android in 2009, and roughly 2.5 times as large a category in 2010.

Here are the trend lines for the first set of three, of the nine main categories that make up Systems and Programming.


Note the scale of the overall category. Programming languages have consistently been the largest category group; the category "prog" has come from a distant third to the number two super_cat in this area. Databases have been consistently declining for about three years now. As mentioned earlier, Software Project Management was one of the biggest losers of 2010, yet it was also the third-largest super_cat in Systems and Programming, preceded by Mobile Programming and Security Topics. However, these latter two showed positive growth, compared to SPM's decline. Another area that came from nowhere and is now a healthy super-category is Data Topics. Many of these titles are similar to the talks, sessions, and writings found at our Strata Conference and Data Science resources.

The second set of three trend line charts are healthy and show less volatility when compared to category groups from other Category Families. Their trend is flat, yet consistent with the seasonal swings of the market.


When comparing the whole year of 2009 to 2010, the Software Engineering group is the largest of the second set of three. It is led by a classic title in The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition and a new classic in Coders at Work: Reflections on the Craft of Programming. The Network category is dominated by CompTIA titles, which hold five of the top ten spots in the category, including the top two.

The third set of trend lines were driven by CISSP, Intrusion topics, and CompTIA Security.


Next up, Post 3 will be about the publishers, winners and losers. Post 4 will contain more analysis of programming languages. And Post 5 will look at digital sales.

Post 3

In this third installment (see Post 1 and Post 2; Posts 4 and 5 to come soon), we will look at how publishers fared in 2010, as compared to 2009. The chart below shows our dashboard view of the large publishers' results for 2010. The most notable piece of information is that Wiley continues to hold the leading spot as the largest publisher (with 32% market share of units sold), while Pearson and O'Reilly both lost 1%, which was picked up by Cengage and McGraw Hill. (We'll look at revenue share later in the analysis.)

Charts: 2009 Pub Share; 2010 Pub Share

You may not recognize the names of all the top publishers, because they are actually conglomerates of many smaller publishing imprints that they've acquired, created or distributed over the years. The imprints are the familiar consumer-facing brands. For instance, when you purchase a book from Peachpit or Sams, you typically see Peachpit or Sams on the spine, not Pearson, even though Pearson owns both companies. In O'Reilly's case, all the imprints that are not branded "O'Reilly" are part of a distribution partnership and are not owned by O'Reilly. The various imprints that make up each major publisher's share are shown in detailed pie charts later in this post.

Let's look at the top publishers and how they performed year-over-year. The following table provides some interesting comparative data.

Publisher 2010 Units 2009 Units 2010 Title Count 2009 Title Count 2010 Efficiency 2009 Efficiency

Wiley 1,887,493 1,904,859 1,538 1,468 1.65 1.61

O'Reilly 1,404,607 1,577,838 1,145 1,185 1.65 1.65

Pearson 1,386,301 1,511,855 1,934 1,936 0.97 0.97

McGrawHill 276,439 255,667 466 454 0.80 0.70

Apress 200,267 212,614 423 389 0.64 0.68

Cengage 167,020 143,521 676 644 0.33 0.28

Reed Elsevier 140,708 128,657 384 355 0.49 0.45

Lightning Source 67,620 61,676 412 325 0.22 0.23

Sum/Avg 5,530,455 5,796,687 6,978 6,756 0.84 0.82

So what is notable in this data? First, the big three publishers (more than 1 million units per year) are all down. Second, four of the top eight publishers are up: McGraw Hill, Cengage, Reed Elsevier, and Lightning Source all had modest gains in 2010. Overall, these top eight publishers collectively saw 266,232 fewer units sold in 2010, with 222 additional titles making the list.
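The totals quoted above can be re-derived from the publisher table with a few lines of Python. The figures are copied from the table; the variable names are only for illustration.

```python
# (2010 units, 2009 units, 2010 title count, 2009 title count), from the table above.
publishers = {
    "Wiley": (1_887_493, 1_904_859, 1_538, 1_468),
    "O'Reilly": (1_404_607, 1_577_838, 1_145, 1_185),
    "Pearson": (1_386_301, 1_511_855, 1_934, 1_936),
    "McGrawHill": (276_439, 255_667, 466, 454),
    "Apress": (200_267, 212_614, 423, 389),
    "Cengage": (167_020, 143_521, 676, 644),
    "Reed Elsevier": (140_708, 128_657, 384, 355),
    "Lightning Source": (67_620, 61_676, 412, 325),
}

units_2010 = sum(row[0] for row in publishers.values())
units_2009 = sum(row[1] for row in publishers.values())
titles_2010 = sum(row[2] for row in publishers.values())
titles_2009 = sum(row[3] for row in publishers.values())

unit_change = units_2010 - units_2009      # negative means fewer units in 2010
title_change = titles_2010 - titles_2009   # positive means more titles in 2010
gainers = [name for name, row in publishers.items() if row[0] > row[1]]

print(f"Units: {units_2010:,} vs {units_2009:,} ({unit_change:+,})")
print(f"Titles: {titles_2010:,} vs {titles_2009:,} ({title_change:+,})")
print("Publishers with unit gains:", ", ".join(gainers))
```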

A note on Market Share versus Title Efficiency

A typical indicator of publisher performance is market share of units sold, which is what we've been looking at so far. Perhaps a better measure is how many published titles it takes to get a comparable share of unit sales. This is the ratio of unit market share to title share. Think about it this way: if a publisher has 15% of the titles appearing in the Bookscan Top 3000, and gets a 15% share of units sold, they will have a ratio of 1:1, expressed as a title efficiency of 1.0. A publisher with 20% of the title share and 10% of the unit share would have a 0.5 efficiency. An efficiency of 1 is the market average: 100% of the title count delivering 100% of the unit sales. A publisher that achieves its share with fewer titles will have a higher ratio. Only two publishers continue to have an efficiency of more than 1: Wiley and O'Reilly.
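The two worked examples in the paragraph above (15% of titles with 15% of units, and 20% of titles with 10% of units) can be checked with a short Python sketch. It takes efficiency to be unit share divided by title share, which is the reading that reproduces the 1.0 and 0.5 figures; the function name is hypothetical.

```python
def title_efficiency(title_share, unit_share):
    """Unit market share divided by title share.

    1.0 is the market average; above 1.0 means the publisher earns its
    unit share with proportionally fewer titles.
    """
    if title_share <= 0:
        raise ValueError("title_share must be positive")
    return unit_share / title_share


print(title_efficiency(0.15, 0.15))  # 1.0
print(title_efficiency(0.20, 0.10))  # 0.5
```

Both shares need to be measured against the same market (all titles and all units in the dataset) for the ratio to be comparable across publishers.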

Publishers under the 1.0 threshold typically have many titles in the Bookscan data, but they are not selling many units. A note of caution, though: some publishers have many evergreen titles, which can skew this data. Typically, older titles sell fewer units each subsequent year. But this is not always true, as some titles continue to sell like they are newly released. Head First Design Patterns is one example, still selling more than the majority of brand-new titles. So efficiency could be thought of as a frequency ratio rather than a true efficiency measure, because it is very efficient to publish a title and have it sell for years. A true efficiency metric would take into account all titles published by all publishers and how many make it into the top 3000. Some publishers have titles that never even make the top 3000, so we will not be able to count them (for or against an efficiency metric) because they are missing from the dataset.

A Note on Evergreen Status

The table below shows the percentage of evergreen titles for each imprint. How did we come up with evergreen status? We assigned points to titles based on the age of their copyright dates: more than 16 years (most points), 10-16 years, 5-10 years, and less than 5 years (no points, because they have not yet proved to be long-lasting). After assigning points for each title, we were able to see what percentage of evergreen titles each imprint had in the top 3000 between the years 2004 and 2010. It's interesting to note, and somewhat expected, that the top three evergreen imprints have a strong academic heritage (Wiley, Addison-Wesley, and Prentice Hall).

Imprint % Evergreen

John Wiley 11.47%

Addison-Wesley 11.07%

Prentice Hall 10.85%

O'Reilly 9.63%

For Dummies 8.43%

Sams 7.79%

Que 7.01%

Peachpit Press 5.62%

McGraw-Hill/Osborne 5.50%

Wrox 3.66%

Apress 2.92%
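The age-based scoring described above can be sketched as a simple bucketing function. The 3/2/1/0 point values, the handling of the bucket edges, and the 2010 reference year are all assumptions made for illustration; the post only says that older titles earn more points and that titles under five years old earn none.

```python
def evergreen_points(copyright_year: int, as_of_year: int = 2010) -> int:
    """Score a title by the age of its copyright date.

    Point values and bucket edges are hypothetical; the source only
    states that older titles receive more points.
    """
    age = as_of_year - copyright_year
    if age > 16:
        return 3  # oldest bucket: most points
    if age >= 10:
        return 2  # 10-16 years
    if age >= 5:
        return 1  # 5-10 years
    return 0      # under 5 years: not yet proven long-lasting


for year in (1990, 2000, 2005, 2008):
    print(year, evergreen_points(year))
```

How the per-title points roll up into each imprint's evergreen percentage is not spelled out in the post, so that step is left out here.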

Now that we have a basic understanding of title efficiency and evergreen status, let's look further into the 2010 results for the imprints and drill in on the top three publishers: Wiley, Pearson, and O'Reilly. This is important because you typically see the imprint name on a book when you purchase it, but may not be aware of who the publisher is. (You'll likely see the publisher inside the book on the copyright page, except in the case of O'Reilly because our other imprints are distribution partners. That is, O'Reilly provides some sales and distribution services to these partners, but they are not owned as is the case for Pearson and Wiley imprints.)

#1 Wiley #2 O'Reilly Media

#3 Pearson

In 2010, O'Reilly Media became the second largest publisher with all imprints aggregated under the O'Reilly partner and distribution umbrella. In this data, I have included all partners and roughly the year they were added to their respective conglomerate. For the most part, our agreement with Microsoft Press is what catapulted O'Reilly Media to become the second largest tech publisher, though only by a slight margin ahead of Pearson (about 18,000 units, per the table above). Wiley continues to dominate as the largest publisher and seems to be more resilient to the market declines, chiefly due to the Dummies brand and its wide-ranging scope. The trend chart below shows the three main tech book publishers and their respective growth by year.

Now that you have an idea of the imprints that make up the largest three publishers, let's tease out all the imprints and look at their respective market share. The following chart shows the top 20 imprints and how they stack up against each other. These imprints saw 422,814 fewer units sold in 2010, and remember that 2009 was not a strong sales year for tech books. The market was held together by the medium-to-small size publishers that you do not see on this list. From this imprint view, you'll notice that O'Reilly has the second largest market share behind Dummies.

So what do the graphs tell us? The first notable thing is that there was very little movement in the top ten imprints. In other words, the imprints that were occupying the top slots still do. In fact, of the top 20 imprints, there were only 7 that showed a slight increase in units compared with 2009 and 8 that showed an increase in dollars (see the graph below). The only movement out of the top ten was Apress, which dropped from #10 to #11. Sybex, Wrox, and then Course Technology (in that order) showed the biggest increase in units from 2009 to 2010 and Wrox, Course Technology, and then Sybex (in that order) showed the highest growth percentage. Basically, Sybex had a larger base to grow from, so the overall share growth was not as significant for them. The other two were half the size and their unit growth made an impact on their growth percentage.

Before analyzing imprints by category, let's revisit the data with dollars rather than units. We have a fairly easy way of calculating this: units sold × list price = dollars. Granted, there are discounts, promotions, and other things that affect the precision of this, but it is pretty close. If nothing else, you can think of this as retail value. So here are the top imprints from a revenue perspective. Again, this is at the imprint level and from a dollar perspective. As you can see, compared to the units chart above, the leaderboard quickly changes. Microsoft Press becomes the number one revenue-producing imprint, followed by O'Reilly, and then Dummies. The biggest move in the top 20 is that Addison-Wesley jumps from #8 in units to #4 in dollars, and conversely, Wiley's Visual imprint goes from #10 in units to #17 in dollars.
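To illustrate why a dollar ranking can differ from a unit ranking, here is a small sketch of the retail-value calculation. The two titles, their unit counts, and their list prices are hypothetical, not figures from the Bookscan data.

```python
from decimal import Decimal


def retail_value(units_sold: int, list_price: str) -> Decimal:
    """Approximate retail dollars as units sold times list price."""
    return units_sold * Decimal(list_price)


# Hypothetical titles: (name, units sold, list price).
titles = [
    ("Budget Guide", 10_000, "21.99"),
    ("Pro Reference", 6_000, "49.99"),
]

by_units = sorted(titles, key=lambda t: t[1], reverse=True)
by_dollars = sorted(titles, key=lambda t: retail_value(t[1], t[2]), reverse=True)

print([t[0] for t in by_units])    # lower-priced title leads on units
print([t[0] for t in by_dollars])  # higher-priced title leads on dollars
```

Prices are kept as `Decimal` values rather than floats so that the currency arithmetic stays exact.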

Imprint Analysis by Category

Now that we have seen a high-level picture of what imprints did in 2010, let's take a look at which categories each of them publishes in and where their strengths lie. Dummies and O'Reilly appear to have the most diverse publishing programs, as they are not at the bottom in any category. Dummies is clearly the leader in Business Apps and Consumer Operating Systems, while O'Reilly has climbed further ahead of Microsoft Press in the Systems and Programming category. This chart also seems to indicate that Addison-Wesley is really only publishing in the Systems and Programming space. That is what some publishers do: they have specific imprints publish in one or two categories only. The jury may still be out on whether that is a good or bad strategy, but Addison-Wesley's success in revenue growth in 2010 could be because of this focus. The flip side is that when there is not much new tech driving one area, a publisher/imprint may become more susceptible to market declines because of the lack of diversification.

Imprints' Category Strength

Categories and the Publishers who Dominate Them

The following category images are for 2010, and the tables have each publisher's count of titles and sum of units. The top titles are also listed for each area in 2010.

Category: Systems and Programming

In this category, you can see that O'Reilly now has the largest market share among the publishers, with Pearson a close second. If we drill into the imprint level, the picture of who is driving this gets clearer. The top six imprints are O'Reilly at 13.66%, Microsoft Press at 10.26%, Addison-Wesley at 9.65%, For Dummies at 7.04%, Apress at 6.67%, and Prentice Hall at 5.48%. What is not obvious from this data is that the top publishers all have come down a couple of percentage points, which means that the market is getting its growth in the middle of the pack.

As you can see in the table below, O'Reilly has the most units and best title efficiency rating. It is a relatively healthy mix. That is, we have quite a few titles, but our efficiency is also significantly above the market average.

Sys & Prog - Publisher Market Share (01/01/2010 – 12/31/2010)

Publisher Units Title Count Units/Title Efficiency

O'Reilly 557,876 637 876 1.63

Pearson 525,897 960 548 1.02

Wiley 386,809 553 699 1.30

Apress 114,705 236 512 0.95

McGraw Hill 111,980 236 474 0.88

Cengage 64,580 252 256 0.48

Lightning Source 34,674 188 184 0.34

Reed Elsevier 31,957 188 170 0.32

Note: This category family contains "programming languages" and "programming", where more units were sold in 2010 than in 2009. The leading titles and publishers for Systems and Programming in 2010 were:

1. PMP Exam Prep: Rita's Course in a Book for Passing the PMP Exam (this is the perennial leader) (RMC)
2. MCTS Self-Paced Training Kit: Configuring Windows 7 (Microsoft Press)
3. CISSP Certification All-in-One Exam Guide, 5th Ed. (McGraw Hill)
4. CCNA Official Exam Certification Library, 3rd Ed. (Cisco Press)
5. Head First Java, 2nd Edition (O'Reilly)

Category: Web Design and Development

In this category, you can see that Wiley has the largest market share among the publishers, with Pearson second. If we drill into the imprint level, the picture changes a bit. The top six imprints are O'Reilly at 21.11%, Dummies at 13.30%, Sams at 6.25%, Wiley at 5.92%, New Riders at 5.82% and Peachpit at 5.40%.

As you can see in the table below, Pearson has the most titles and their performance is strong in this category. In Web Design and Development, it used to be that most of the top publishers were above the title efficiency average of 1.0, but now there are only three top publishers over the 1.0 efficiency threshold. (This suggests that there are a lot of second-tier publishers with lower efficiency who don't show up in the table.) The category declined by about 25,000 units, yet saw 21 additional titles make the list in 2010. This contributes to the decrease in category/publisher efficiency.

Web Des & Dev - Publisher Market Share (01/01/2010 – 12/31/2010)

Publisher Units Title Count Units/Title Efficiency

Wiley 234,853 219 1,072 1.48

Pearson 232,547 287 810 1.11

O'Reilly 221,876 211 1,052 1.45

Apress 48,812 124 394 0.54

Lightning Source 16,345 104 157 0.22

The leading titles and publishers for Web Design and Development are:

1. Don't Make Me Think: A Common Sense Approach to Web Usability, 2nd Ed. (New Riders)
2. Head First HTML with CSS & XHTML (O'Reilly)
3. CSS: The Missing Manual (O'Reilly)
4. HTML, XHTML, and CSS: Visual Quickstart, 6th Ed. (Peachpit)
5. WordPress For Dummies, 2nd Ed. (Wiley)

Category: Business Applications

In this category, you can see that Wiley has the largest market share among the publishers and O'Reilly has moved ahead of Pearson for second. (Apologies: the O'Reilly 21% is obscured in the graph.) If we drill into the imprint level, the picture changes a bit. The top six imprints are Dummies at 28.34%, Microsoft Press at 14.72%, McGraw Hill/Osborne at 7.06%, O'Reilly at 6.29%, John Wiley at 5.74%, and Que at 4.18%.

Bus Apps - Publisher Market Share (01/01/2010 – 12/31/2010)

Publisher Units Title Count Units/Title Efficiency

Wiley 566,391 386 1,467 1.74

O'Reilly 278,058 162 1,716 2.04

Pearson 199,261 296 673 0.80

McGraw Hill 96,806 121 800 0.95

Cengage 102,034 75 1,360 1.2

Cengage 35,625 200 178 0.21

Apress 16,370 38 431 0.51

The leading titles and publishers for Business Applications are:

1. Facebook For Dummies (Wiley)
2. Office 2007 All-in-One Desk Reference For Dummies (Wiley)
3. Excel 2007 For Dummies (Wiley)
4. Microsoft Office Excel 2007 Step by Step (Microsoft)
5. QuickBooks 2010 The Official Guide (McGraw Hill)
6. Excel 2007 All-In-One Desk Reference For Dummies (Wiley)

Category: Consumer Operating Systems

In this category, you can see that Wiley has the largest market share among the publishers (a whopping 46%), with O'Reilly again in second at 22% and Pearson comfortably in the third spot at 17%. (Apologies: the O'Reilly percentage is partially cut off in the graph.) If we drill into the imprint level, the picture changes a bit. The top five imprints are Dummies at 31.00%, O'Reilly at 13.15%, Que at 9.39%, Microsoft Press at 8.52%, and Wiley's Visual at 7.10%.

As you can see in the table below, Wiley has the most titles and a relatively good efficiency rating, and O'Reilly also has a very healthy title efficiency rate. What is impressive about this category is that four of the top six publishers average more than 1,000 units per title. To me, that means it's a category that sustains numerous big-seller titles, not just an occasional retail success. As you can see from the best-selling titles below, it is mostly Windows 7 that is driving this category, though iPad: The Missing Manual came from virtually nowhere to be among the bestsellers in 2010.

Cons Opsys & Dev - Publisher Market Share (01/01/2010 – 12/31/2010)

Publisher Units Title Count Units/Title Efficiency

Wiley 490,682 182 2,696 1.48

O'Reilly 237,508 62 3,831 2.11

Pearson 184,497 129 1,430 0.79

McGraw Hill 48,684 50 974 0.54

Cengage 37,679 86 438 0.24

Computer Step 37,528 24 1,564 0.86

The leading titles and publishers for Consumer Operating Systems are:

1. Windows 7 For Dummies (Wiley)
2. Windows 7 For Dummies Book + DVD Bundle (Wiley)
3. Windows 7 Plain & Simple (Microsoft Press)
4. Mac OS X Leopard: The Missing Manual (O'Reilly)
5. Windows 7 Step by Step (Microsoft Press)
6. iPad: The Missing Manual (O'Reilly)

Category: Digital Media

In this category, you can see that Pearson has regained the top spot, with Wiley falling to second. As you can see in the table below, Pearson has the most titles and a relatively good efficiency rating. O'Reilly also has an extremely healthy efficiency rate and average units per title. This relates to my earlier comment about publishing fewer titles, while getting more out of the ones that you do publish. For instance, in the table below, Reed Elsevier has moved into third for publishers, yet this is largely due to having roughly three times as many titles as O'Reilly, and hence a much lower efficiency rating. If you factor in that Reed Elsevier also made 1/4 of its units from one title, its efficiency is a little deceiving.

If we drill into the imprint level, the picture changes a bit. The top six imprints are Peachpit Press at 18.19%, For Dummies at 15.97%, Focal Press at 12.91%, O'Reilly at 11.18%, Adobe Press at 9.91%, and New Riders at 7.73%.

Digital Media - Publisher Market Share (01/01/2010 – 12/31/2010)

Publisher Units Title Count Units/Title Efficiency

Pearson 228,640 186 1,229 1.27

Wiley 173,616 157 1,106 1.14

Reed Elsevier 76,359 110 694 0.72

O'Reilly 74,235 36 2,062 2.13

Cengage 15,027 47 320 0.33

The leading titles and publishers for Digital Media are:

1. Adobe Photoshop CS5 for Photographers (Focal Press)
2. The Adobe Photoshop CS5 Book for Digital Photographers (New Riders)
3. Adobe Photoshop CS4 Classroom in a Book (Adobe Press)
4. Adobe Photoshop CS5 Classroom in a Book (Adobe Press)
5. Photoshop Elements 8 for Windows: The Missing Manual (O'Reilly)

Next up, Post 4 will contain more analysis of programming languages. And Post 5 will look at digital sales.

Post 4

In this fourth post (posts one, two and three are found here) on the State of the Computer Book Market, we will look at programming languages and drill in a little on each language area.

Overall, the market for programming languages was down 5.90% in 2010 when compared with 2009. There were 6,303,125 units sold in 2009 versus 5,931,452 units sold in 2010, a decrease of 371,673 units. Java experienced the biggest gain in units, at 28,633 more units in 2010 than 2009, while PHP occupied the opposite end with the biggest decrease at 38,614 fewer units year-over-year.
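A percent change depends on which year's total is used as the base, so it is worth being explicit. The sketch below computes the change both ways from the two unit totals quoted above; the function name is illustrative.

```python
def pct_change(new: float, old: float, base: float) -> float:
    """Change from old to new, expressed as a percentage of base."""
    return (new - old) / base * 100


units_2009 = 6_303_125
units_2010 = 5_931_452

unit_change = units_2010 - units_2009
vs_2009 = pct_change(units_2010, units_2009, base=units_2009)  # conventional year-over-year base
vs_2010 = pct_change(units_2010, units_2009, base=units_2010)  # same change against the 2010 total

print(unit_change, round(vs_2009, 2), round(vs_2010, 2))  # -371673 -5.9 -6.27
```

Measured against the prior year, which is the usual year-over-year convention, the decline is about 5.9%; measured against the 2010 total, the same unit change reads as about 6.27%.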

Before we begin to drill in on the languages, we thought it would be best to explain our "language dimension." When we group books by their language dimension, we categorize them by the language used in their code examples. So Flash Programming with Java would be in our Flash atomic category, but the language dimension would be Java. Similarly, our Head First Design Patterns book contains examples written in Java, so it too carries the "java" tag on the language dimension.

To provide some perspective, 2009 and 2010 have been the worst two years for book sales in the category of programming languages. The chart directly below does not include books that are method-oriented, about project management, about Consumer Operating Systems, or books without language-oriented material. So this is a different view of the market than the overall view found in Post 1 of this series. In the chart below you can see all languages on a week-by-week basis, showing that the years 2009 and 2010 are consistently below prior years.

In 2008, we reported that C# surpassed Java as the number one language. But hold on: Java proved to be resilient in 2009, experienced a resurgence in 2010, and is now the number one language from a book sales perspective. As you can see in the 2010 Top 20 languages chart below, Java has a significant lead in the language race, with Objective-C moving into third place closely behind C#.

2010 Market Share

If you look at the chart below, you will see which languages were responsible for the most units sold between 2004 and 2010. Newer languages, or "fad" languages, may not be as well represented because they have had less time to generate significant units in our data set. The chart is basically the sum of units for each language during this time period. The top ten languages generated unit sales of 7,655,365 for the 7-year period, while the second ten generated 1,919,691 in the same period. The top ten languages represented roughly 80% of units sold during this period. Looking at the 7-year trend for the languages, you can see that C# had been steadily growing until 2009 while Java had been going in the opposite direction during the same period. In addition to Java, VBA, VBScript, SAS, JavaScript, C++ and C showed growth from 2009 to 2010. The other 13 languages showed declines when comparing 2009 to 2010.
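As a quick check of the "roughly 80%" figure, the sketch below divides the top-ten total by the combined top-20 total. It assumes the share is measured against those 20 languages only, since no total for the remaining languages is given here.

```python
def share_of_total(part: int, rest: int) -> float:
    """Percentage that part represents of part plus rest."""
    return part / (part + rest) * 100


top_ten_units = 7_655_365
second_ten_units = 1_919_691

share = share_of_total(top_ten_units, second_ten_units)
print(f"{share:.2f}%")  # 79.95%
```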

A Treemap View of the Programming Languages

In the treemap view above, which compares the last quarter of 2010 with the last quarter of 2009, you'll notice a lot of bright green areas, several solid green areas and a fair share of black and red areas. The main reason Objective-C is down 12% is that it had a tremendous 2009, which was hard to sustain. The language came from a small speck on this treemap view, to occupying a fairly sizable square.

Before we dive in, let's look at the high-level picture for the grouping of languages. I have grouped these languages by total number of units sold between 2004 and 2010. As you can see in the table below, only the Mid-Major group experienced growth in 2010, while the rest showed declines. The language driving the most growth in the Mid-Major area was R. An interesting observation is that the statistical languages, much like those you would be exposed to at our Strata Conference, are experiencing substantial growth. Namely, R, SAS, MatLab, LabView, Mathematica, and SPSS have collectively seen an increase of 49,504 units, or a whopping 102.87% growth. Maybe Hal Varian's quip about Statistics being the "sexy job of the future" is motivating developers to learn these languages.

Group Unit Range 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

Large 50,000–200,000 1,051,945 1,069,762 1,590 1,433 75.96% 75.00%

Major 10,000–49,999 227,306 254,587 450 456 16.41% 17.85%

Mid-Major 3,000–9,999 53,152 44,909 104 85 3.84% 3.15%

Mid-Minor 1,682–2,999 20,818 20,965 61 58 1.50% 1.47%

Minor 1,000–1,680 13,000 15,517 46 31 0.94% 1.09%

Linelist 399–999 6,299 6,350 25 19 0.45% 0.45%

TheRest <399 3,370 6,368 49 43 0.24% 0.45%
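The grouping can be sketched as a lookup on each group's lower bound. This assumes the unit ranges apply to a language's 2010 units, as the section headings later in the post suggest; the constant and function names are illustrative.

```python
# Lower bound of each group, largest first, taken from the table above.
GROUP_FLOORS = [
    ("Large", 50_000),
    ("Major", 10_000),
    ("Mid-Major", 3_000),
    ("Mid-Minor", 1_682),
    ("Minor", 1_000),
    ("Linelist", 399),
]


def language_group(units_2010: int) -> str:
    """Return the group whose lower bound the unit count meets or exceeds."""
    for name, floor in GROUP_FLOORS:
        if units_2010 >= floor:
            return name
    return "TheRest"


print(language_group(194_520))  # Java's 2010 units
print(language_group(7_800))    # R's 2010 units
print(language_group(399))      # Spin's 2010 units
```

Using only lower bounds avoids the small gaps between the published ranges, such as the one between 1,680 and 1,682.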

For the sake of grouping and presenting this information in a more readable format, the language tables that follow use these column headers:

*Large* UNITS TITLES MARKET SHARE

1. Language: name or short name of the language
2. 2010 Units: units sold in 2010
3. 2009 Units: units sold in 2009
4. 2010 Titles: number of titles making the Bookscan 3000 in 2010
5. 2009 Titles: number of titles making the Bookscan 3000 in 2009
6. 10Mkt Share: 2010 market share
7. 09Mkt Share: 2009 market share

The following table contains data for the Large languages. As you can see, 4 of the top 10 languages experienced growth in 2010 (Java, JavaScript, C/C++, and VBA), led by Java's impressive turnaround. As you may remember from previous posts, Java had been on a steady decline in units sold until 2009, and it rebounded in 2010. Could Android development be fueling this Java resurgence? Even though Objective-C experienced a decline in 2010 compared to 2009, it is amazing that it made the top ten. Previous rankings had the language near the 20th spot. JavaScript continues its steady growth as it solidifies its spot as the most used/important language for web programming.

Large Programming Languages: 50,000–195,000 units in 2010

*Large* UNITS TITLES MARKET SHARE

Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

Java 194,520 165,887 361 332 13.90% 11.54%

C# 153,469 156,043 263 230 10.97% 10.86%

Objective-C 136,711 141,608 89 51 9.77% 9.85%

JavaScript 131,850 115,107 169 157 9.42% 8.01%

PHP 106,952 145,566 163 152 7.64% 10.13%

C/C++ 94,268 93,067 192 184 6.74% 6.48%

VBA 61,108 48,507 68 58 4.37% 3.38%

ActionScript 60,578 83,017 96 85 4.33% 5.78%

Python 58,905 60,700 94 84 4.21% 4.22%

SQL 53,584 60,260 95 100 3.83% 4.19%

Here are the top titles for the Large languages. Incidentally, the titles and order are the same whether you look at units sold or dollars generated, except that the WordPress title falls out of the top five and Addison-Wesley's PHP and MySQL Web Development moves to #5:

O'Reilly Learning PHP, MySQL, and JavaScript, First Edition

O'Reilly Head First Java, Second Edition

Wrox Professional Android 2 Application Development

Addison-Wesley Programming in Objective-C 2.0

Dummies WordPress for Dummies (covers PHP)

You'll notice in the Major languages that C, Powershell, Shell Script, and VBScript all had growth. Overall, these languages sold roughly 27,000 fewer units in 2010 compared to 2009. That equates to roughly an 11% decrease for the Major languages.

Major Programming Languages: 10,000–49,999 units in 2010

*Major* UNITS TITLES MARKET SHARE

Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

.NET Languages 44,958 57,286 82 78 3.25% 4.02%

Visual Basic 42,225 55,574 88 94 3.05% 3.90%

C 36,638 34,820 91 83 2.65% 2.44%

Ruby 20,004 29,977 48 63 1.44% 2.10%

Powershell 18,652 12,124 26 19 1.35% 0.85%

Transact SQL 17,507 17,601 28 29 1.26% 1.23%

Perl 15,606 20,030 32 34 1.13% 1.40%

Pl/Sql 10,670 10,974 24 26 0.77% 0.77%

Shell Script 10,720 7,482 20 17 0.77% 0.52%

VBScript 10,326 8,719 11 13 0.74% 0.61%

Here are the top titles for the Major languages.

Prentice Hall C Programming Language

Prentice Hall Practical Guide to Linux Commands, Editors, and Shell Programming

O'Reilly Learning Perl, 5th Edition

Morgan Kaufmann Programming Massively Parallel Processors: A Hands-on Approach (C language)

Pragmatic Agile Web Development with Rails, Third Edition

Mid-Major Programming Languages: 3,000–9,999 units in 2010

The news in this category is that the statistical languages are doing really well. As noted above, these languages have grown by 102.87% from 2009 to 2010. The most impressive growth is for the eight titles for the R language: the overall category is led by R in a Nutshell.

*Mid-Major* UNITS TITLES MARKET SHARE

Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

SAS 9,035 7,974 27 21 0.65% 0.56%

SPSS 8,973 6,818 16 10 0.65% 0.48%

MatLab 7,857 6,752 22 17 0.57% 0.47%

R 7,800 2,817 15 12 0.56% 0.20%

Processing 6,996 6,038 8 6 0.51% 0.42%

Shell Script 6,073 7,116 19 16 0.44% 0.50%

Basic 5,540 5,277 7 9 0.40% 0.37%

Lua 4,677 5,570 7 6 0.34% 0.39%

Assembly 4,391 4,359 18 14 0.32% 0.31%

MDX 3,890 4,838 8 8 0.28% 0.34%

UnrealScript 3,028 2,440 3 3 0.22% 0.17%

Here are the top titles for the Mid-Major languages.

O'Reilly R in a Nutshell: A Desktop Quick Reference

Prentice Hall Using SPSS for Windows and Macintosh: Analyzing and Understanding Data

SAS Press The Little SAS Book: A Primer, Fourth Edition

Open University Press SPSS Survival Manual: A Step by Step Guide to Data Analysis Using SPSS for Windows

Sams Mastering Unreal Technology, Volume I: Introduction to Level Design with Unreal Engine 3

Mid-Minor: 1,682–2,999 units in 2010

The news in this category is the growth of functional languages, like F#, Scala, and Lisp. These languages more than doubled year-over-year (105.70% growth), generating 7,648 units in 2010 compared to 3,718 units in 2009.

*Mid-Minor* UNITS TITLES MARKET SHARE

Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

F# 2,905 1,095 6 5 0.21% 0.08%

Scala 2,531 3,946 5 5 0.18% 0.28%

Groovy 2,452 3,972 7 8 0.18% 0.28%

Alice 2,441 2,472 10 9 0.18% 0.17%

Blitzmax 1,836 2,603 2 2 0.13% 0.18%

AppleScript 1,787 3,994 4 6 0.13% 0.28%

VHDL 1,785 1,733 18 15 0.13% 0.12%

Bash 1,715 183 2 1 0.12% 0.01%

Lisp 1,684 309 4 6 0.12% 0.02%

LabView 1,682 658 3 1 0.12% 0.05%

Here are the top titles for the Mid-Minor languages.

Prentice-Hall Learning To Program with Alice

Artima Programming in Scala: A Comprehensive Step-by-step Guide

No Starch Press Land of Lisp: Learn to Program in Lisp, One Game at a Time!

Prentice-Hall LabVIEW 2009 Student Edition

Manning Real World Functional Programming: With Examples in F# and C#

Minor Languages: 1,000–1,680 units in 2010

This category of languages saw 6 of the 10 languages sell fewer units in 2010. There was roughly a 16% decrease in units sold year-over-year. The bright spot was the performance of Mathematica, mostly fueled by the Mathematica Cookbook. Like the previous category, this area is dominated by functional languages; however, these languages are not experiencing the same substantial growth.

*Minor* UNITS TITLES MARKET SHARE

Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

Mathematica 1,675 900 9 4 0.12% 0.06%

Erlang 1,513 2,276 3 2 0.11% 0.16%

Scheme 1,479 1,364 8 7 0.11% 0.10%

FBML 1,367 2,335 5 4 0.10% 0.16%

Clojure 1,332 1,460 2 1 0.10% 0.10%

AWK 1,200 1,642 2 2 0.09% 0.12%

Nxt-g 1,172 969 4 1 0.08% 0.07%

Scratch 1,112 674 2 2 0.08% 0.05%

Latex 1,099 1,623 6 5 0.08% 0.11%

Haskell 1,051 2,274 5 3 0.08% 0.16%

Here are the top titles for the Minor languages.

O'Reilly Mathematica Cookbook

O'Reilly ERLANG Programming

O'Reilly Real World Haskell

Pragmatic Programming Clojure

O'Reilly sed & awk

Linelist: 399–999 units in 2010

This category of languages saw 6 of the 10 languages sell more units in 2010, although the sales volume is fairly insignificant. There was roughly a 0.80% decrease in units sold year-over-year. I am not going to list the bestsellers, because they are not exactly bestsellers in this sort of category.

*Linelist* UNITS TITLES MARKET SHARE

Language 2010 Units 2009 Units 2010 Titles 2009 Titles 10Mkt Share 09Mkt Share

Tcl 965 856 3 4 0.07% 0.06%

Stata 818 954 6 4 0.06% 0.07%

Peoplecode 702 444 2 1 0.05% 0.03%

Hla 625 0 1 0 0.05% 0.00%

Linden Script 623 1,695 4 3 0.04% 0.12%

D 604 0 1 0 0.04% 0.00%

Mel 587 1,022 5 4 0.04% 0.07%

Kml 531 973 1 1 0.04% 0.07%

Opengl Shader 445 406 1 2 0.03% 0.03%

Spin 399 0 1 0 0.03% 0.00%

TheRest Programming Languages: < 399 units in 2010

Lastly, the following languages sold fewer than 400 units in 2010. Here is the list in descending order: autolisp, unity, x++, cfml, inform, mysql spl, blitz3d, q, nxt, gml, pure data, javafx, rpg, cobol, nxc, minitab, ml, boo, ada, fortran, octave, jcl, racket, jsl, idl, cfscript, abap, verilog, m, smalltalk, mumps, go, windows script, egl, c/al, realbasic, bondi, cl, cs2, eiffel, ocaml, and xquery.

Next up, Post 5 will look at digital sales.

Post 5

In this final post (Posts 1-4 are found here), I will provide a summary of the first four posts, provide some insight into a view of top authors, and include some data on electronic books and how parts of the digital world are surpassing the print world.

Here is a quick summary of Posts 1-4.

In 2010 the book market, as a whole, was about 4.54% lower in unit sales than in 2009. The tech book market was off by 6.46% in 2010, so it suffered a little more than other sectors of the industry. However, that was an improvement from the 13.05% decline the tech publishers suffered in 2009. The market continued to follow its seasonal pattern, getting off to a fast start in 2010, taking its typical nose-dive in July, and recovering in the fall. There were 268 more titles (from all copyright years) that made it into the Top 3000 reports during 2010, and 260 more in 2009 than 2008. This is about 3.5% more titles per year. The average units per title declined from 39.64 in 2009 to 37.92 in 2010. There were 104 fewer titles published in 2010 that made the dataset, but they averaged 6 more units per title. Again, these titles had a publish date during 2010.

Mobile is one of the main driving forces in the market. Android Programming and Android devices (for users) both had tremendous growth, and iOS and Objective-C continued to expand market share. Windows 7 is another strong influence, climbing to the market-share levels that Windows XP used to occupy but that Vista never attained. Tablets came from virtually nowhere to become a hot category at the end of 2010. I would imagine the Tablet category will continue its rapid growth, as there is an abundance of Android tablets on the near-term horizon. On the other hand, Web Page Creation, Web Design Tools, and Web Programming have all had significant declines. MacOS experienced its first decline in years as Snow Leopard did not reach the levels of previous OS releases. Both Flash and Silverlight also had significant declines, as HTML5 is poised to offer similar functionality and may already be causing their market popularity to wane.

From a publisher's perspective, O'Reilly moved into the second spot at the end of 2010, behind Wiley and slightly ahead of Pearson. The two imprints of O'Reilly and Dummies continue to have the most diverse publishing programs due to their strong performance in all six tech categories. The number one title, from a dollar perspective, was PMP Exam Prep, Sixth Edition: Rita's Course in a Book for Passing the PMP Exam and, from a unit perspective, Windows 7 For Dummies. The number one programming language in both 2009 and 2010 was Java, with JavaScript and VBA also showing strong growth in 2010. That's the quick review.

Now let's turn our attention to the most important ingredient in publishing: authors. Authors are the entities that create all types of content. And there are all types of authors. Some are really like small publishing houses with "co-authors" doing most of the heavy lifting. Then there are those who do all the lifting, editing, writing, testing, and coding of the content themselves, and then move on to help promote, market and sell. These latter activities are what contribute to what we call an Author Platform. Some authors have an inherent platform by who they are or what their 9-5 job is, while others have to work hard to cultivate their platform. The most successful authors in our dataset have figured out both the upfront creation of content and the end-game of helping with marketing and sales. When you look at the data for the top 15 authors (basically, who has produced more units and dollars), you get the following two charts, showing lifetime sales (2004-2010).

Units Dollars

In 2010, Paul McFedries had his name on 58 different books (ranging from 2001 through 2010) that made our list, for a total of 112,152 units sold. His books sold the most units in 2010, and 12 of his new books published in 2010 made the list. His total was about 20,000 more units than David Pogue, who saw 13 of his titles make the list with 92,000 units sold. Pogue, however, averaged about 7,000 units per title, while McFedries averaged about 1,900 per title. The remaining authors in the top five for unit sales are Andy Rathbone, Greg Harvey, and Dan Gookin. And if you look at the top five authors from a dollars perspective, the list, in descending order, looks like this: Paul McFedries, David Pogue, Andy Rathbone, Rita Mulcahy, and Scott Kelby.
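The per-title averages quoted above follow directly from the unit and title counts. As a small sketch (the function name is illustrative, and Pogue's 92,000 is the rounded figure from the text):

```python
def avg_units_per_title(units: int, titles: int) -> float:
    """Average units sold per title on the list."""
    return units / titles


mcfedries = avg_units_per_title(112_152, 58)
pogue = avg_units_per_title(92_000, 13)

print(round(mcfedries))  # 1934
print(round(pogue))      # 7077
```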

Electronic distribution and Sales

Now let's move past print sales in 2010, or at least partially away from traditional channels of distribution, to discuss e-distribution. The charts immediately below show eBook sales in 2008 and 2010. The data from the Association of American Publishers (AAP) is found here. What amazes me is the growth in scale: the top of the chart scale went from $18 million in 2008 to $120 million in 2010, roughly a sevenfold increase. And the market is just now at a tipping point because of all the tablet devices being released. If the market continues on its current growth pattern, it will be a billion-dollar business in 2013.

Market in 2008 Market in 2010

The AAP has these caveats to explain the data:

● The data above represent United States revenues only.
● The data above represent only trade eBook sales via wholesale channels. Retail numbers may be as much as double the above figures due to industry wholesale discounts.
● The data above represent only data submitted from approx. 12 to 15 trade publishers.
● The data does not include library, educational or professional electronic sales.
● The numbers reflect the wholesale revenues of publishers.
● The definition used for reporting electronic book sales is "All books delivered electronically over the Internet OR to hand-held reading devices".
● The IDPF and AAP began collecting data together starting in Q1 2006.

Based on these caveats, it is an understatement, in my opinion, to say that the market will be a billion dollar market in 2013.

The following two charts show Safari revenue growth and the oreilly.com direct sales mix. I am showing these because the same content that goes into our print books is available in various digital forms. Safari is a subscription service with more than half a million users. Its main focus is its B2B service, which gives developers at many of the largest companies in the world access to Safari Books Online. One notable difference is that the categories with consumer-oriented titles, along with many of our Digital Media titles, do not perform as well in Safari. Developer titles rule in Safari. As you can see from the chart to the left, our content in Safari is growing at a nice steady rate.

The chart on the right is potentially more interesting and indicative of what is happening in the market. This shows the percentage of sales on oreilly.com during 2010 for Books, Ebooks, and Video. These are the big three content types, with much of the same content. The ebooks are just digital versions of our print products. So the big, and more likely HUGE, news is that ebooks represent about 88% of our unit sales and 79% of our dollar sales on oreilly.com. What is really impressive is that the growth of our digital products is moving faster than the decline of our print products. This suggests to me that we are not seeing one product type cannibalizing another; rather, they are supplementing each other.

The third chart shows the changing nature of our publishing program. The percentages represent each particular format and how many units we sold that year. To me, this shows the trend of what is happening. Print is slowly declining, and EPUB, Mobi, and Ebooks are skyrocketing off the chart at a rate faster than print is declining. PDF is declining, and when you think about it, this makes sense. O'Reilly used to offer two types of book product: print and PDF. Now we offer our content in virtually any form a reader would like. So with Mobi, EPUB, and Ebooks, we are seeing the less useful PDF decline significantly.

Safari Growth for O'Reilly oreilly.com Sales Mix

O'Reilly Product Sales Mix

Again, this data is taken from direct sales for O'Reilly and oreilly.com, and may not represent the whole computer book market. I have heard from other publishers, specifically Dave Thomas at the Prags, that this split is consistent with, or a little behind, their publishing program. Smaller publishers who are growing are seeing digital products catch on quicker than print, and some of us who have sizeable legacy print programs are seeing a faster ramp-up of digital products than the decline of our print programs. It is an interesting notion to talk about a print program as 'legacy', but that is truly the best description. How's that for the changing nature of the computer book market?

Thank you for reading these posts. If there is something that you are itching to see [understand more clearly], please let me know and I will try to help. I plan to excerpt updated pieces of these posts on Twitter throughout the year. They'll come from @mikehatora and will likely get retweeted by @oreillymedia.