Text Mining for the Brand Intermediaries – a CGM Whodunnit

A Colloquial Treatment of the Product Performance and Outcomes Aspects of Mining the Blogosphere for Brand Services Monitoring.
Alan Wilensky
The author’s recently posted monograph, “Employing Advanced Natural Language Text Processing to Provide Guidance to Mid-Market Multiline Dealers and Product Distributors”, has generated a great deal of blog traffic, but some readers have asked for a briefer, more colloquial treatment.
The following is a basic description of where the industry has led itself, and where the author thinks it should be going, based on research recently conducted for a telecom industry client. The conclusions below do not reveal any of the verbatim strategic conclusions the author provided to that client.
The text mining industry, and in particular those specializing in brand monitoring services, has come to focus on ‘sentiment’ as the metric du jour. We all know the typical meaning of sentiment: it’s how we feel, or how we express how we feel. The problem occurs when we create machine-scored metrics of a very human thing, such as sentiment, and expect the mined results from the blogosphere to make sense.
Consider the linguistic problems of generational usage: “that is so bad” means bad in one generation and really great in another; “pimp” is derogatory in one generation and, quite amazingly, positive in another (“we be each other’s pimp”, meaning advocates; “pimp my ride”, meaning make fancy). One can find even subtler examples of how linguistic nuance might confound an algorithm. There are many more robust social media metrics that far exceed the reliability of sentiment, or that can augment sentiment to strengthen the ultimate guidance being sought from the corpus in question, i.e., the blogosphere and/or public user forums.
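To make the point concrete, here is a minimal sketch of the kind of word-level lexicon scorer that underlies many naive sentiment metrics; the lexicon entries are invented for illustration, not drawn from any vendor’s product. Generational slang defeats it immediately:

```python
# A toy lexicon-based sentiment scorer. Word scores are hypothetical
# illustrations of how such lexicons are typically built.
LEXICON = {"bad": -1, "great": 1, "pimp": -1, "fancy": 1}

def naive_sentiment(text):
    """Score a text by summing per-word lexicon values (unknown words score 0)."""
    return sum(LEXICON.get(word, 0) for word in text.lower().split())

# "pimp my ride" is praise in one generation's usage, yet the word-level
# lexicon scores it negative:
print(naive_sentiment("pimp my ride"))    # -1, though the speaker means it positively
print(naive_sentiment("that is so bad"))  # -1, even when the speaker means "great"
```

The failure is structural: no amount of lexicon tuning resolves a word whose polarity depends on the speaker’s generation, which is why phrase- and context-level methods are discussed later in this piece.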
But linguistic problems aside, it is the role the early text mining entrants have cast for themselves as ‘brand monitoring surrogates’ that is really the point of contention. Who is concerned with monitoring brand? Why, the brand owners of the Fortune 1000. These large companies, mostly in the durable goods sector (for who blogs about toothpaste and other consumer packaged goods?), have had unfettered access to the best brand monitoring and consulting practices. AC Nielsen, Arbitron, and Gallup are the giants of this industry, but there are others.
Creating text mining services for the Fortune 1000 has been a high-latency, fussy business. Whereas professional brand monitoring is based on actuarially sound models and standardized sampling methods, these recently innovated ‘social media text mining services’ often have account reps, and sometimes computational linguists (gasp), work with clients to identify verbiage that does, or does not, express sentiment about the products and brand issues of concern. But there is a problem:
Catering to these brand owners is, as previously stated, a high-latency business; contracts from the leaders sometimes take 2-4 weeks in the sales cycle, and two-plus weeks to set up the query and dashboard reporting. Furthermore, the market is limited to the relatively small group of brand owners who are used to sampling brand awareness and regularly avail themselves of brand equity practices, such as consulting. Such brand consulting businesses often make their nut by taking a percentage of the ad buy. This brave new world of ‘text mining of the blogosphere’ is a curiosity that has not made significant inroads into the brand monitoring business (perhaps a very optimistic $100M, compared to the entrenched brand services billions). There is also ample evidence that the clientele served, and the investors at equity in such new-age ventures, are tiring quickly of the model and its results.
So, what is needed for text mining of the public corpus to succeed? First, these services must turn their attention from the brand equity owners to the branding recipients: those who must deal with the customer’s perception of branding, product performance outcomes, and interactions with the entire spectrum of the product’s touch points (service, warranty, dealerships, etc.).
Who are these prime recipients of brand decisions? Customers, certainly, but from the point of view of a web-based service that analyzes the public corpus, the true targets are the brand intermediaries. These intermediaries are multiline retailers and distributors that run the gamut of local shops, regional retailers, and national department stores and distributors. They are the true recipients of branding decisions, and they have had very little guidance to steer their decisions on whether to add or drop product lines, take advantage of ‘spiffs’ (incentives that cover cooperative advertising and floor-plan financing), or any decision affecting which brands to carry and promote.



A New Service for CGM

Consumer Generated Media Metrics Services:

Employing Advanced Natural Language Text Processing to Provide Guidance to Mid-Market Multiline Dealers and Product Distributors.

Alan Wilensky
Executive Summary
The present rage over such issues as ‘sentiment analysis’ and text mining of the public corpora is fertile ground for a fresh and focused analysis of the state of the CGM industry; such an analysis was recently undertaken by the author. The author’s privately commissioned report answered various questions, such as:

what is the state of the industry?
are early entrants employing a sustainable business model?
what is the traditional / entrenched competition?
what has been missed, and where is the prime opportunity?

Without betraying the verbatim text of the 90-day analysis compiled at the behest of the previous client, the author would like to share the non-proprietary conclusions leading to strategies for future products and services.

The most abstract and brief statement of the analysis is that the current CGM services being proffered to the marketplace by the early entrants (Cymfony, Buzzmetrics) are focused purely on brand owners, chiefly within the Fortune 1000. The services provided by these leaders are positioned against traditional brand equity services (provided by AC Nielsen, Arbitron, and Gallup). Cloning the established recurring-campaign model, and offering such services to a limited cadre of potential clients (who already have access to repeatable data from the entrenched leaders), is not a formula for success.

The author’s commissioned analysis for the previous client bears this out by verifying statements from some of the equity investors involved in early CGM ventures. Such bitter regrets led to the sale of Cymfony to TNS; the very nature of the acquiring entity is a repudiation of the brand equity campaign model being applied to Consumer Generated Media metrics.

Sentiment is the weakest of CGM metrics, and the model of highly customized campaigns, iteratively refined by the client and the CGM agency, is simply unsustainable when compared to the services of the multi-billion-dollar brand equity leaders, who derive solid, actionable data from surveys, focus groups, and other statistical sources.

Furthermore, the attribute of brand ‘equity’, or ownership, is truly limited to a very elite few when viewed within the total business opportunity matrix. Therefore, the few high-latency campaigns that brand owners have sampled, as replacements for or adjuncts to existing brand equity services, are simply not making a sustainable impression.

The author believes that the real, sustainable market for CGM analysis lies in the mid-market – those companies that are recipients of branding decisions and who make the daily decisions as to what lines to carry, and which products to drop; we are truly speaking of the multi-line retailers and product distributors that most of us deal with regularly in our personal and professional lives.

The mid-market makes up the lion’s share of commerce decisions; while the Fortune 1000 CPG and durable goods markets toy with sentiment analysis, the mid-market struggles with the allocation of limited dollars in actual cash and finite lines of credit. Which line shall we carry or drop? What action shall we take in light of incentives taking the form of cooperative advertising and floor-plan financing contributed by manufacturers and upstream distributors? In short, what actionable intelligence can we gain from any service that can steer the ship of commerce?

The answer is to create this actionable market intelligence from multiple text corpora (for statistical accuracy), and to employ a greatly extended model of linguistic metrics based on phrasal ontologies that detect consumer issues, outcomes, and declarations. Such issue detection methods yield real, actionable metrics. These metrics lend themselves to statistical scrutiny, and may be charted against offers from the brand owners, such as the aforementioned cooperative advertising program dollars, as well as seemingly advantageous adjustments to a multiline dealer’s floor-plan financing carrying charges.
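The phrasal approach can be sketched in miniature: a small set of phrase patterns mapped to issue categories, matched against consumer posts. The categories and patterns below are hypothetical illustrations, far simpler than a real modular ontology, but they show why phrase-level issue detection is more actionable than a bare sentiment score:

```python
import re

# A toy "phrasal ontology": issue categories mapped to phrase patterns.
# Categories and patterns are invented for illustration.
ISSUE_PATTERNS = {
    "warranty_outcome": [r"\b(denied|refused) (my )?warranty\b",
                         r"\bwarranty (claim|repair)\b"],
    "service_outcome":  [r"\bnever (buy|shop)\w* (again|there)\b",
                         r"\b(returned|sent back) (it|the \w+)\b"],
    "dealer_interaction": [r"\bdealer(ship)?\b"],
}

def detect_issues(text):
    """Return the set of issue categories whose phrase patterns match the text."""
    lowered = text.lower()
    return {category
            for category, patterns in ISSUE_PATTERNS.items()
            if any(re.search(p, lowered) for p in patterns)}

post = "They denied my warranty claim and I will never buy there again."
print(detect_issues(post))  # flags both a warranty outcome and a service outcome
```

Counts of such category hits across a corpus are the kind of statistically tractable metric that can be charted against co-op dollars or floor-plan terms, in a way a single sentiment number cannot.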

With such a service in place, dealers and distributors, who must make constant corrections to their product catalogs and inventory levels, may avail themselves of a more dispassionate decision-support methodology. Such a hosted-services solution can be monetized in many ways, e.g., by subscriptions at varying service levels, as a free or ad-supported service, or as an up-sell generator for hosted CRM services.

By linking the aforementioned perceptions and “statements of outcome” with variables derived from the decision-matrix data important to the mid-market (such as cooperative advertising, floor-planning, and inventory financing incentive programs), such a subscriber-based system can offer guidance to multiline dealers and distributors on purchasing decisions, weighing program incentives against overall market perceptions.
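One way such a linkage might work is a simple blended score per product line, trading off in-the-wild outcome metrics against the dollar value of incentives. The field names, weights, and cap below are hypothetical illustrations, not a prescription:

```python
from dataclasses import dataclass

@dataclass
class LineMetrics:
    outcome_score: float    # aggregated "statement of outcome" metric, -1..1
    issue_rate: float       # fraction of mentions flagging service/warranty issues
    coop_ad_dollars: float  # cooperative advertising contribution
    floorplan_relief: float # floor-plan financing incentive value

def carry_score(m, w_outcome=0.5, w_issues=0.3, w_incentive=0.2,
                incentive_cap=10000.0):
    """Blend market perception with incentive value into one carry/drop score."""
    # Normalize incentives to 0..1 so weights stay comparable.
    incentive = min(m.coop_ad_dollars + m.floorplan_relief, incentive_cap) / incentive_cap
    return w_outcome * m.outcome_score - w_issues * m.issue_rate + w_incentive * incentive

line = LineMetrics(outcome_score=0.4, issue_rate=0.1,
                   coop_ad_dollars=5000, floorplan_relief=2500)
print(round(carry_score(line), 3))
```

A dealer would rank candidate lines by this score and see directly when a generous co-op offer is merely compensating for poor in-the-wild outcomes, which is exactly the dispassionate check the essay argues the mid-market lacks.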

The component technologies that must be developed for this system are sufficiently novel that revenues from licensing, syndication-style Web 2.0 widget embedding, and turnkey systems via VARs are entirely possible as ancillary revenue streams, in addition to offering a comprehensive hosted solution.

The author is of the firm opinion that such a system is within reach, given appropriate project management, sound architectural principles, and creative thinking diligently applied to crystallizing superior solutions.


Predicting Advanced Outcomes

Balanced Services Rationale of the Universal Taxonomy of Customer Services and Product Performance Outcomes

A bi-corpus stochastic method of reconciling customer service interactions with product performance outcomes.

Alan Wilensky, Analyst, vCastprofiles
Charles H. Martin, Ph.D.

Analyst’s Steering Notes

As if marching not only in lockstep, but in apparent synchrony, the seed players of repute within the very young CGM metrics industry have all entered the virgin sector as brand service aspirants. Such an obvious strategy seems logical on its face, but upon further examination, fails in thought and deed. As explained in the accompanying report, the brand services industry is well established, provides repeatable metrics, and is the province and preference of internal brand managers. Although classical brand services are certainly not barred from extending CGM services to their proven sampling sets, and may be doing so, the performance of the early sector leaders has hardly made a dent in current practices.

The balanced services offering for CGM seeks to redress the major flaws of current CGM practice by providing a bridge between business intelligence mined from call centers, warranty and support systems, and surveys, and in-the-wild metrics from the public corpora. By reconciling the topic chains between these two data sources, we can create a more rationally scored set and derive, over time, modular ontological chains that remove the very ambiguities found in the current state of CSA. Further expansion of the metrics beyond sentiment (the weakest and most subjective of the variables) will enhance the evolving accuracy of the product’s output.
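The reconciliation idea can be illustrated with a deliberately small sketch: boost confidence in a public-corpus topic when the same topic also surfaces in internal BI data, so cross-verified issues outrank unverified chatter. Topic names, counts, and the boost factor are hypothetical:

```python
from collections import Counter

# Hypothetical topic counts from the two corpora being bridged.
internal_topics = Counter({"battery_failure": 120, "billing_dispute": 45})  # call-center / warranty BI
public_topics   = Counter({"battery_failure": 300, "ugly_color": 80})       # blogs / forums

def reconciled_scores(internal, public, boost=2.0):
    """Weight each public-corpus topic; topics verified by internal data get boosted."""
    return {topic: count * (boost if topic in internal else 1.0)
            for topic, count in public.items()}

print(reconciled_scores(internal_topics, public_topics))
# {'battery_failure': 600.0, 'ugly_color': 80.0}
```

Real reconciliation of "topic chains" would of course involve linguistic normalization rather than exact string matches, but the scoring principle, privileging issues that both corpora report, is the same.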

In the industry’s short history, there have been hints of a developing interest in cross-verified linguistic metrics. Some of these early methods are purely statistical, while others build on the wide body of research into monolithic ontological models. Applying any such research, whether adapted from academia or wholly original, will be a non-trivial undertaking, as the final offering must deliver an actionable service across the value chain, from the top brand owners, through the mid-market, down to the smallest dealers in a product network. Catering to the mutually overlapping desires of the entire value chain is an application well suited to modular ontological models and the hybrid statistical verification of multiple sources.

Fancy words aside, the problems faced by early sector entrants have been manifold with regard to extending the measurement model and metrics, and especially reconciling them against corporate BI data warehouses. As most of the visible leaders are venture-backed, they have had to ‘stick to the knitting’, making CSA pay no matter how weak or unsustainable the model. But this missive is not a competitive or sector commentary; it is a rationale. It is important to point out, however, that the evolution of a balanced services model is a resource-intensive undertaking, requiring fairly extensive test regimes that employ CS and CRM system mining as well as sophisticated linguistic modeling. Such a research program is the domain of the well-resourced, not of venture-backed entities with a time-to-harvest bias.

Telcos and media conglomerates are entities well suited to evolving the balanced model; the size of such a company, its services portfolios, and its global relationships open the door for a full vetting of the concepts of reconciliation, concomitants, and the evolution of modular ontologies for cross-value-chain issue detection. Such a product offering is deep, expansive, and could sweep the sector as the normative method of providing real-time, actionable market intelligence.