Google’s impact on enterprise content management

Without a doubt, Google has had a huge impact on how enterprises think about enterprise content management (ECM).

The pluses and minuses were highlighted by two recent blog posts:

On the plus side, John Mancini of AIIM listed three "fundamental assumptions about information management that affect the ECM industry" in his "Googlization of Content" post:

  1. "Ease of use. The simple search box has become the central metaphor for how difficult we think it ought to be to find information, regardless of whether we are in the consumer world or behind the firewall. This has changed the expectations of how we expect ECM solutions to work and how difficult they are to learn.
  2. Most everything they do is free...
  3. They have changed how we think about the "cloud." Google has changed the nature of how we think about applications and how we think about where we store the information created by those applications. Sure, there are all sorts of security and retention and reliability issues to consider..."

On the negative side, Alan Pelz-Sharpe posted a piece today on CMS Watch titled "Google – unsuitable for the enterprise". Alan introduced his piece by saying:

"For years now Google has played fast and loose with information confidentiality and privacy issues. As if further proof were needed, the PR disaster that is Buzz should be enough to firmly conclude that Google is not suitable for enterprise use-cases." He went on to say, "It is inconceivable that enterprise-focused vendors... would ever contemplate the reckless move that Google undertook in deliberately exposing customers' private information to all and sundry with Buzz."

Google is a hugely successful and extremely profitable company. However, they are not a software company. Fundamentally they are an advertising placement company, and everything they do is motivated by maximizing advertising revenue, whether directly or indirectly. 99% of their revenue comes from advertising, which pays for every cool project they do and every service they offer.

While Google services to consumers have no monetary charge, they are not free:

  • You agree to accept the presentation of advertisements when you use Google products and services; most people believe these to be easily ignored despite the evidence of their effectiveness.
  • More importantly, you agree to provide information about your interests, friends, browsing and search habits as payment-in-kind. Most people sort of know this, but don't think about it. If you ask them whether they are concerned that Google has a record of every search they have ever performed, they start to get uncomfortable. I expect most of us have searched on terms which, taken out of context, would take a lot to 'explain.'

While most consumers in democracies are currently cavalier about issues of their own privacy, enterprises most certainly are not. Indeed, the need for careful management of intellectual property, agreements, revenue analyses and a host of other enterprise activities captured in content is precisely why they buy ECM systems.

The furor over Buzz shows that Google acts first and foremost to further its own corporate goals, and that those goals can clash with the goals of other enterprises.

On the other hand, Google's goals do require it to align with user needs, especially for good interfaces. An easy-to-use interface encourages and sustains use. That ought to be obvious to everyone, but when the effects of the interface on usage are easily measurable and directly tied to revenue (as in the case of Google Search), it becomes blatantly and immediately evident. By comparison, the development of an interface for an enterprise software product may take place months or even years before the product is released. Even if detailed usability research is done with test users, and in-depth beta programs are employed, the feedback is lower in quality and slower to arrive.

Besides easy interfaces, enterprise content management users expect 'Google-like' search and are disappointed by what they get. There are generally two reasons for this:

  • Search results have to be further processed to determine if a user can be presented with each 'hit' based on their permissions
    • Typically 70-90% of the total computational time for enterprise search is taken up by permission checking
  • Enterprises don't invest as much in search infrastructure as they would if the rapid delivery of search results were seen as critical
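
The first reason, permission checking after the search itself, can be illustrated with a minimal Python sketch. Everything in it is hypothetical (the `Hit` type, the `ACL` table, the `can_read` function); a real ECM system would consult its own permission model, which is usually far more expensive than a dictionary lookup. The point it shows is only that each candidate hit costs one permission check before it can be shown, so the number of checks can exceed the number of results displayed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    score: float


def trim_by_permission(hits, user, can_read, page_size=10):
    """Return the first page of hits the user may see, plus the checks performed."""
    visible = []
    checks = 0
    for hit in hits:  # hits are assumed to arrive sorted by relevance
        checks += 1
        if can_read(user, hit.doc_id):
            visible.append(hit)
            if len(visible) == page_size:
                break
    return visible, checks


# Hypothetical access-control list: document id -> users allowed to read it
ACL = {
    "doc-1": {"alice"},
    "doc-2": {"bob"},
    "doc-3": {"alice", "bob"},
    "doc-4": {"bob"},
}


def can_read(user, doc_id):
    return user in ACL.get(doc_id, set())


hits = [Hit("doc-1", 0.9), Hit("doc-2", 0.8), Hit("doc-3", 0.7), Hit("doc-4", 0.6)]
page, checks = trim_by_permission(hits, "bob", can_read, page_size=2)
print([h.doc_id for h in page], checks)
```

Here the user sees two results but the system performed three checks, because the top-ranked hit was not readable. When most hits are hidden from a given user, the gap grows accordingly.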

The second point is probably more important than people admit. In my experience IT departments do not allocate significant computational resources to search. I suspect that they look at average resource utilization rather than peak load and the time taken to deliver results to users. To deliver the typical half second or less response that Google considers to be essential, hundreds of servers may be involved. I am not aware of any enterprise that allocates even the same order of magnitude of resources to content searching, so inevitably users experience dramatically slower response times.

In summary, the alignment of optimal user experiences with Google's need to place advertisements has advanced the standards of user interfaces and provided many 'free' services, but the clash of Google's corporate goals with the goals of other corporations has shown that enterprise content has a value that enterprises are not likely to trade away.

Syndicated at http://conversations.opentext.com/


The ‘Second Coming’ of Renditions - Video

Long-time ECM veterans will remember the concept of a document rendition: a transformed alternative form of a document. I think we'll see renditions again.

A rendition is essentially another form of a specific version of a document. There are two common types of renditions based on format and content:

  1. The same information content as the original document, but a different file format
    • For example, a spreadsheet file can be renditioned as a PDF
  2. The same file format as the original document, but different content
    • For example, an MS PowerPoint document written in English can have a rendition that is also a PowerPoint file, but whose content has been translated into French
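
As a rough illustration of the two types, here is a minimal Python sketch of a document version that carries renditions. The class names, fields and storage paths are all hypothetical and are not taken from any particular ECM product; the sketch only shows that a rendition hangs off a specific version and differs from it in format, in content (here, language), or both.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rendition:
    file_format: str  # e.g. "pdf", "pptx"
    language: str     # e.g. "en", "fr"
    location: str     # hypothetical storage path for the rendition's bytes


@dataclass
class DocumentVersion:
    name: str
    number: int
    file_format: str
    language: str
    renditions: list = field(default_factory=list)

    def add_rendition(self, rendition):
        self.renditions.append(rendition)

    def find(self, file_format=None, language=None):
        """Return renditions matching the requested format and/or language."""
        return [
            r for r in self.renditions
            if (file_format is None or r.file_format == file_format)
            and (language is None or r.language == language)
        ]


# Type 1: same content, different format (spreadsheet -> PDF)
budget = DocumentVersion("budget", 3, "xlsx", "en")
budget.add_rendition(Rendition("pdf", "en", "store/budget-v3.pdf"))

# Type 2: same format, different content (English deck -> French deck)
deck = DocumentVersion("deck", 1, "pptx", "en")
deck.add_rendition(Rendition("pptx", "fr", "store/deck-v1-fr.pptx"))

print(budget.find(file_format="pdf"))
print(deck.find(language="fr"))
```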

Renditions for limited bandwidth in the 1990s

In the 1990s, one of the common use cases was dealing with the limited bandwidth available at the time. It often took a long time to download and open a document just to see if it contained what you were looking for. Accordingly, Open Text Livelink automatically made HTML renditions of many common formats such as MS Word. These renditions were much smaller files and so could be downloaded much faster for quick review.

I remember presenting the use case to customers: "If you want to look quickly at a file without opening the full thing..." Back then bandwidth was so limited it made sense. Now it seldom does, although some specific use cases still have value, such as renditions that contain added content like secured signatures.

Bandwidth issues are back

Bandwidth is becoming limiting again – not for 'simple' text documents, but for rich media files such as videos. In fact bandwidth issues are so acute that the shape of the Internet has changed radically in the last few years. The explosive growth of video sharing has led to the rise of Content Delivery or Distribution Networks (CDNs) such as Akamai Technologies, Limelight Networks, CDNetworks and Amazon CloudFront to enable effective distribution.

Akamai recently claimed they handle around 20% of Internet traffic by volume. Most of this traffic is rich media, which must be delivered very quickly because users expect pages to load almost immediately even if they contain a video. A recent Forrester report says the expected threshold for a page to load has become two seconds.

For video files to be useful to end users they have to start to play almost instantly. This is usually achieved by:

  • Locating a copy in close network proximity to the end user
    • CDNs use many distributed sites around the 'edge of the Cloud' to ensure that there is at least one site close to an end user, preloaded with the files that are expected to be required
  • Reducing the size of the video through transcoding and compression
  • Streaming – starting to play before all of the content is received

The increasing use of mobile devices, with their narrow and unstable bandwidth connections and different format requirements, creates further hurdles to serving users rapidly.

Enterprise needs

So what about the enterprise or corporate user? Trained by the web, he/she expects to click on a link and have a video start playing within two seconds. But most internal ECM systems (e.g. for document management) are designed to download a complete file before it is available to the end user.

A story – Here's a scenario I experienced recently. A Finance department prepared a new expense form. To show staff how to use it they prepared a five-minute video. The trouble was that their WMV format video was over 300MB. For most staff in a global company, especially remote staff, downloading a 300MB file to view it is just not practical. What Finance needed was to be able to upload the video, and have the system take care of making a rendition that was transcoded and compressed, made streamable and hosted on a CDN.
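
To put a rough number on "not practical", the short Python calculation below estimates how long a 300MB download takes at a few link speeds. The speeds are illustrative assumptions, not measurements from the company in the story, and the calculation ignores protocol overhead and contention, so real downloads would be slower.

```python
def download_seconds(size_megabytes, link_megabits_per_second):
    """Idealized transfer time: file size in megabits divided by link speed."""
    size_megabits = size_megabytes * 8  # 1 byte = 8 bits
    return size_megabits / link_megabits_per_second


# Assumed link speeds in Mbit/s, chosen only for illustration
for mbps in (2, 10, 100):
    seconds = download_seconds(300, mbps)
    print(f"{mbps:>3} Mbit/s: {seconds / 60:.1f} minutes")
```

Under these assumptions a remote user on a 2 Mbit/s link waits about 20 minutes before a five-minute video can even start, and even at 10 Mbit/s the wait is about four minutes, far beyond the two-second expectation set by the web.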

There are just too many manual steps and too many options for most newcomers to video creation. Systems should take care of most of those steps. One excellent way to execute several of them is to have the ECM system create a rendition of a deposited video that contains embed code to start a player and stream the video from a CDN. Viewers can then simply click on the object name in their ECM system and a streamed video starts to play almost instantly – as they have come to expect with sites such as YouTube.
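
As a sketch of what such an embed-code rendition might contain, the Python function below builds a small HTML snippet that points a player at a CDN-hosted copy of the video. It assumes a browser that supports the standard HTML `<video>` element rather than any particular player product, and the function name and the `cdn.example.com` URL are hypothetical. Transcoding, compression and the upload to the CDN are assumed to have happened already and are not shown.

```python
from html import escape


def build_embed_rendition(title, stream_url, mime_type="video/mp4"):
    """Return an HTML snippet that plays a video from a CDN-hosted URL."""
    return (
        f'<video controls preload="metadata" title="{escape(title, quote=True)}">\n'
        f'  <source src="{escape(stream_url, quote=True)}" '
        f'type="{escape(mime_type, quote=True)}">\n'
        "  Your browser does not support embedded video.\n"
        "</video>"
    )


# Hypothetical CDN location of the transcoded expense-form video
snippet = build_embed_rendition(
    "How to complete the new expense form",
    "https://cdn.example.com/finance/expense-form.mp4",
)
print(snippet)
```

The ECM system would store this small snippet as the rendition and serve it when the object is opened, so the user's browser fetches the video from the CDN instead of downloading the original file from the repository.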

So renditions have a place in the new enterprise again to deal with bandwidth limitations!
