Article++ (Building a learning organization – David Garvin)

The best place to find this article is at the ‘Harvard Business Review on Knowledge Management’ series of articles published in 1990-1991. Interestingly, this article follows one on a similar theme by Ikujiro Nonaka which is quite contrary to what David Garvin suggests in this article.

While Nonaka focuses on implicit knowledge, cultural knowledge and the role of an enterprise-wide motto in fostering knowledge creation and learning, Garvin holds that standard processes and measurable indices that ‘calculate’ the learning process are necessary for building a learning organization. He further criticizes Nonaka’s approach on the grounds that a system without proper checks and balances is difficult to introduce and then to manage to the point where it becomes an inherent part of corporate culture.

Garvin suggests that a learning philosophy like Nonaka’s is quite abstract and idealistic and lacks an operational plan to carry it out. He proposes an alternative model and offers his ideas on how to develop SOP guidelines for building a learning organization. But before presenting this model, he stresses the need for a proper framework within which such ideas are developed, so that different strategies can be compared on a common scale.

The Framework:

Garvin suggests a framework of 3M’s: Meaning, Management and Measurement. His reasoning is that ‘learning’ is still an elusive concept, that there are no best practices for governing a learning organization, and that there is no unified, quantifiable way to measure a learning strategy. The foremost problem, therefore, is not to bring forth new ideas but to develop a framework within which these ideas can be built and tested.

Garvin points out that an organization should first define what ‘learning’ means to it, what its goals of ‘learning’ are and what outcome it expects from ‘learning’. Once these aspects are better understood, the organization can safely claim to have a clear ‘Meaning’ of a learning organization. Next, a concrete SOP has to be implemented so that the learning process is standardized across teams and people within the organization. This he terms ‘Management’. Thirdly, there have to be tools for measuring the outcome of the learning procedures, so that the system can diagnose itself and tune itself in the direction suggested by the measured indices. Thus, a ‘Measurement’ has to be clearly defined. Once these three things are in place, one can design an idea to build a learning organization.

The Idea:

After devising a framework, Garvin develops an idea within the 3M’s framework by defining learning as:

“A learning organization is an organization skilled at creating, acquiring, and transferring knowledge, and at modifying its behavior to reflect new knowledge and insights.”

Having said that, he points out that without action associated with it, there is no learning. That is, according to Garvin, learning is only ever ‘learning by doing’, and it is also the major instrument for employee motivation. If people see that the concepts, ideas, trades or skills they have learned are being put into practice, they feel motivated to learn and improve further. According to him, a learning management system must have:

  1. Systematic problem solving method which ensures consistency and quality. Such a system is data driven and scientific.
  2. Experimentation by either ongoing programs which provide incremental knowledge gains or by demonstrations or prototypes which provide a holistic update to the knowledge creating process.
  3. Learning from past experience enables one to stop repeating mistakes through lessons learned and develop a best practices knowledge base to improve work. With such a system in place, a failure becomes a productive failure as it brings insight into the product or process.
  4. Learning from others, which means overcoming the ‘not invented here syndrome’ and following the SIS idea (Steal Ideas Shamelessly).
  5. Transferring knowledge through flatter organizational structure, corporate knowledge repositories and a smartly designed incentive system.

Among measuring schemes, conventional methods like ‘learning curves’ and ‘manufacturing progress functions’ have their shortcomings, so Garvin suggests using the ‘half-life curve’ as a suitable tool to measure a learning strategy.
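To make the half-life idea concrete: a half-life curve tracks how long it takes to achieve a 50% improvement in a chosen performance measure. The sketch below is only one possible way to estimate that figure from a series of measurements. The log-linear least-squares fit, the function name `half_life` and the sample defect data are all assumptions made for illustration and do not come from Garvin's article.

```python
import math

def half_life(times, values):
    """Estimate the time needed to halve a performance measure.

    Fits log(value) = a + b * t by ordinary least squares and returns
    -ln(2) / b. Assumes values are positive and should fall over time
    (for example defect rates or late deliveries).
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("measure is not improving; half-life undefined")
    return -math.log(2) / slope

# Hypothetical data: a defect rate (%) sampled every 3 months,
# constructed so that it halves every 9 months.
months = [0, 3, 6, 9, 12, 15, 18]
defects = [8.0 * 0.5 ** (m / 9) for m in months]

print(f"estimated half-life: {half_life(months, defects):.1f} months")
```

A shorter half-life means the team is learning faster, which lets teams working on very different measures be compared on one scale.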

Finally, he discusses the high level process of building a learning organization. It starts with a cognitive phase, in which people are exposed to new ideas and digest them, followed by a behavioral phase, in which these ideas are put to use, and finally a process improvement phase, in which the changed behavior improves a KPI, quality or efficiency. These processes start only once a learning environment has been created.


Article++ (Teaching smart people how to learn – Chris Argyris)

Chris Argyris brings forward a very useful article, which I believe to be one of the best I’ve read on the subject. This is not just because it introduces new phrases like ‘single loop’ and ‘double loop’, but because the insight he delivers is very realistic. Personally, I found considerable improvement in my practical experiences.

What is Learning? How to learn? Argyris does not start off by forwarding his doctrine on these questions but rather focuses on human behavior and our learning patterns. On empirical grounds, he points out that learning is the process of solving problems outside one’s comfort zone of presuppositions and beliefs. According to him, failure is an important ingredient of better learning, in the sense that those who have never failed have never faced an uncomfortable or unforeseen situation and are not equipped to cope under distress. Another reason people fail to learn beyond their routine tasks (termed single loop learning) is the gap between their perceived behavior and their actual behavior. By nature, humans tend to blame the environment and contradict themselves when the perceived and the actual don’t match.

Argyris then explains why people avoid (double loop) learning and attributes this to fear of failure and over-ambitious goals. People never set average standards; they set the highest bar and often fail to reach it. Yet subconsciously, people attribute these failures not to their own high standards but to the external environment. The fear of failure stops people from venturing outside their comfort zones and into self-analysis, which hurts the ego.

This thesis is well backed by empirical human psychology, and the article provides some tips on how to avoid locking oneself into single loop learning. In Statistical Learning Theory, we say a learning model is overfit (and hence not acceptable) if it clings too rigidly to beliefs based on past knowledge. What is sought is a flexible learning model, one that has not ‘converged’ and become stuck in a local loop.

Article++ (The knowledge-creating company – Ikujiro Nonaka)

Japanese Management is well known for the unique approach by which it brings about success in organizations. Many of its common principles appear alien to what is, roughly speaking, the Western school of Management. In the context of Knowledge Management, it contributes several new ideas rooted in demographic traditions, among other factors, in the Orient. Nonaka starts off this article by focusing on the importance of leveraging tacit knowledge. There is a strong emphasis on corporate knowledge culture and on the enterprise knowledge embedded in an organization’s human resources.

However, a philosophy of harnessing tacit knowledge from employees is far harder to implement, given the inherent challenge of making that knowledge explicit. Nonaka, together with Takeuchi, visualized such a process in what is known as ‘The Spiral of Knowledge’ or the SECI Model.

The Spiral of Knowledge

In this article, Nonaka further reinforces the spiral model of knowledge flow. An important question that arises in the context of a knowledge-creating organization is: what defines knowledge creation, and where does knowledge originate?

According to Nonaka, ‘new knowledge begins from chaos’. Chaos here is further elaborated as a tool that senior management uses in a controlled way to provoke employees into thinking ‘against’ an ambiguous or vague corporate metaphor. According to him, an organization should rally around a metaphor which defines a motto for the organization but at the same time signals a vague or ambiguous concept, one that encourages employees to think about the overall process and try to make meaning out of it. He explains that any corporate motto holds a specific and different meaning for each employee because of the different functional contexts in which they work. These differing meanings give rise to debate and questioning, which ultimately leads to a creative process. Nonaka strengthens this hypothesis by citing several examples of notable organizations which employed similar tactics and succeeded not just in creating new knowledge but in improving the creativity of their employees. More specifically, knowledge creation for Nonaka is the conversion of tacit knowledge into explicit knowledge by putting tacit knowledge to use in an innovation process.

Besides metaphors, Nonaka also advocates the use of analogies if an organization finds metaphors too confusing for its employees. In his words, “An analogy is an intermediate step between pure imagination and logical thinking”. Sometimes these analogies are an evolved form of a metaphor, fine tuned into a more standardized, less context sensitive meaning. They thus abstract the tacit knowledge, a step which the SECI model terms ‘Conceptualization’. Organizations with a well defined and successful analogy in place are on the way to a model that represents the organization’s tacit knowledge.

One notable difference between Nonaka’s theme and Drucker’s is redundancy. Where Drucker feels that redundancy is a mere burden and a bottleneck, usually a byproduct of too many layers of middle management, Nonaka finds redundancy to be a chief means of harnessing tacit knowledge, from internalization through to conceptualization in terms of his Spiral Knowledge Model.

Redundancy lets an organization experience a concept in different contexts and through different functional teams, so that a holistic view of the concept is created. Redundancy need not take the form that Drucker and others despise; what it creates is a “common cognitive ground”. Such redundancy can be created by frequent job rotation or by common access to a corporate knowledge base, where the job of knowledge creation is not confined to particular teams or individuals but is everyone’s responsibility. According to Nonaka, this does not mean that there are no specific job roles within a knowledge creating company, only that knowledge creation is not confined to a single location.

The bottom line is that knowledge management is all about managing tacit knowledge and communicating it across the enterprise in an endless repetitive cycle.

Article++ (The coming of the new organization – Peter Drucker)

This article is a fantastic tool for boosting the emerging KM industry, especially for those in the developing world who stand against bureaucracy: it enlightens them about the changes needed not just to withstand 21st century business pressures but to move towards an innovating organisation. Peter Drucker is a household name among old school managers as well as the younger lot, and when he points out the changing landscape of organizational setups, demands and behavior, these old timers just have to listen.

Drucker associates the coming of the new organization with the advent of data processing technologies. These are not a prerequisite for an information based organization, but without them a setup risks drowning in a ‘swamp’ of data. By transforming acts of diagnosis into analysis, data processing tools bridge a huge gap between innovation and business operations. Organizations that foster information turn every business issue into an opportunity; risks are precalculated and the success rate of business decisions soars.

But there is a caveat for any organization that wants to model itself as an information based organization. Organigrams have to be redesigned, with very few middle management layers, flatter structures and a shift of knowledge from the upper echelons to front line workers, who are themselves transformed from mere human hands into knowledge workers capable of working in cross-functional and sometimes cross-domain teams.

The way Drucker outright dismisses middle management, pointing out that it merely relays information and that better technology and better informed frontline workers make this role unnecessary, sends a very strong signal that (middle) management in fact becomes inefficient, if not a bottleneck.

This also brings about a massive change in people’s mindsets and careers. In the new organization, which caters for specialists, there will be far fewer opportunities to jump to ‘management’ simply because there is no substantial middle management layer. Secondly, in a knowledge intensive society, progress is measured not by promotion to management but by specialization of knowledge and knowledge-based achievements within one’s domain.

Organigrams will also vary over time as task forces replace dedicated but sparsely connected departments and divisions. These task forces will be formed from a mix of domains, specializations and functions. Teams will be highly communicative and will participate in the entire lifecycle of an operation, in contrast to the traditional approach where one domain specialist or functional department becomes the major stakeholder at a particular point in the operation’s lifecycle. These temporary task forces improve the organization’s cost efficiency and encourage it to set up task forces more frequently, which points towards an innovation friendly organization.

The bottom line is that when workers start keeping the bigger picture in mind through abstraction, and figure out the information flow between units, they are on track towards an information based setup. Such information flows exist between superiors and subordinates, but according to Drucker most of the information flow will happen among colleagues and across vertical functions, enabling greater abstraction and a holistic view for these knowledge workers.

Multi-Relational Learning

In many real world domains, hidden information is present in the inter-relationships between different classes within the data. This information can be relational, hierarchical or conditional in nature. Most of the time, it is implicitly codified when the data schemas for the problem at hand are designed. For data mining, all such schemas are denormalized, since conventional data mining algorithms work on a single table structure at a time. Denormalization loses the implicit relationships present in the original schema, and so data mining starts off by losing valuable information.
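A small sketch can show what gets lost. The two tables, their column names and the numbers below are hypothetical and chosen purely for illustration. After a join flattens customers and orders into one table, a single-table learner sees every row as an independent instance, so a customer with many orders is silently counted many times.

```python
# Hypothetical normalized schema: one row per customer, many orders per customer.
customers = {1: {"age": 30}, 2: {"age": 60}}
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 35.0},
    {"order_id": 12, "customer_id": 1, "amount": 15.0},
    {"order_id": 13, "customer_id": 2, "amount": 50.0},
]

def denormalize(customers, orders):
    """Join each order with its customer's attributes into one flat row."""
    return [{**o, "age": customers[o["customer_id"]]["age"]} for o in orders]

flat = denormalize(customers, orders)

# Statistic over the original customer table: each customer counts once.
true_mean_age = sum(c["age"] for c in customers.values()) / len(customers)

# Same statistic over the flat table: customer 1 now counts three times.
flat_mean_age = sum(r["age"] for r in flat) / len(flat)

print(true_mean_age, flat_mean_age)
```

The flat table no longer records that three of its rows describe the same person, which is exactly the kind of one-to-many relationship the original schema encoded.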

To overcome this problem, data analysts denormalize data interactively, using their background domain knowledge to preserve data inter-relationships. The challenge, however, lies in fully automating this process, and that is where the emerging field of Relational Data Mining comes in. For a comprehensive book, check out Relational Data Mining by Dzeroski and Lavrac.

Domains where RDM (Relational Data Mining) holds great potential include bioinformatics, social networking, viral marketing, natural language processing and text mining, to name a few. What all such domains have in common is high data dimensionality, categorical data and data that can be represented as a graph structure.

In high dimensional data mining, the main problem is the sparsity of the feature vectors constructed. The learned feature vectors also tend to be larger than the original data set if the data is categorical in nature. A naive approach in such an environment is to try to fit the data to a normal distribution, but this is not a robust strategy. On top of this, in certain domains (like bioinformatics) the data miner often struggles to fully understand the nature of the domain, and so tends to miss important relationships while preparing the data for analysis.

The essence of RDM is an expressive language for patterns involving relational data, a language more expressive than conventional (propositional) data mining patterns and less complex than first order logic. Inductive Logic Programming (ILP), in the context of KDD, provides this language, sometimes called relational logic, for expressing patterns that contain data inter-relationships.

Many data mining algorithms have a relational counterpart: for CART there is S-CART, and for C4.5 there is a relational version called TILDE. Similarly, there are relational association rules and relational decision trees, and methods built on the notion of a relational distance measure, like RIBL.

However, even though multi-relational learning holds promise, the field is still far from being able to generalize its methodologies to the whole spectrum of data mining problems. The field of Statistical Relational Learning, as it is sometimes called, rests on the assumption that models built over apparent data together with the relational data within it yield better results than models built over apparent data alone. As pointed out by [2], however, this is not the general case: in certain data sets, models built on the intrinsic (apparent) data alone are better than those that also use relational data.

Secondly, due to the inherent complexity of relational data, it has been observed that deterministic relational learners don’t produce results as good as those of probabilistic relational learners. Statistical relational learning accurately predicts structured data and is able to chalk out dependencies between data instances, which were largely ignored in previous machine learning setups.

Besides the nature of the data sets, relational learning algorithms also differ in how they approach the problem. Earlier relational learners concentrated more on propositionalization of relational data into ‘flat’ data and then applying conventional learners to it. More recent tactics incorporate the relational data schemes directly into the learner’s framework [1]. Both approaches continue to progress.
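As a rough sketch of what propositionalization looks like in practice, the snippet below summarizes a one-to-many relation into per-entity aggregate features, so that a conventional single-table learner can be applied afterwards. The tables, the column names and the choice of aggregates (count, sum, max) are assumptions made for illustration, not a method taken from the cited work.

```python
from collections import defaultdict

# Hypothetical relational data: customers and their orders.
customers = {1: {"age": 30}, 2: {"age": 60}, 3: {"age": 41}}
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 50.0},
]

def propositionalize(customers, orders):
    """Build one flat row per customer with aggregates over related orders."""
    grouped = defaultdict(list)
    for o in orders:
        grouped[o["customer_id"]].append(o["amount"])
    table = []
    for cid, attrs in sorted(customers.items()):
        amounts = grouped.get(cid, [])
        table.append({
            "customer_id": cid,
            "age": attrs["age"],
            "n_orders": len(amounts),                  # how many related rows
            "total_amount": sum(amounts),              # sum over related rows
            "max_amount": max(amounts, default=0.0),   # 0.0 if no orders
        })
    return table

for row in propositionalize(customers, orders):
    print(row)
```

Unlike a plain join, this keeps one instance per customer, but the aggregates are fixed in advance, so any pattern they do not capture is invisible to the learner. That limitation is one motivation for learners that work on the relational schema directly.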

Thus, the field of relational learning is gaining wide acceptance and suitable methodologies for applications in general fields are being devised.

Efficient Adaptive-Support Association Rule Mining for Recommender Systems – Lin, Alvarez, Ruiz Kluwer 2001

This paper deals with online collaborative recommendation. Its algorithm, ASARM, focuses on only those association rules pertaining to a particular user or article at a time. The algorithm also introduces a heuristic that adapts the minimum support of the association rules to be generated and, instead of a confidence threshold, uses a range of rule size.
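The adaptive-support idea can be sketched as a search over the minimum support threshold. The version below is a simplified illustration under stated assumptions, not the procedure from the paper: it assumes the target is a range on the number of rules returned, uses plain bisection, and replaces the real rule miner with a hypothetical `mine` function that filters a fixed list of candidate rules by support.

```python
def adapt_min_support(mine, min_rules, max_rules, lo=0.0, hi=1.0, max_iter=30):
    """Bisect on minimum support until the rule count falls in the target range.

    `mine(support)` must return the list of rules whose support is at least
    `support`. Returns the final support value and the rules mined with it.
    """
    support = (lo + hi) / 2
    rules = mine(support)
    for _ in range(max_iter):
        if len(rules) > max_rules:
            lo = support   # too many rules: raise the support threshold
        elif len(rules) < min_rules:
            hi = support   # too few rules: lower the support threshold
        else:
            break          # rule count is inside the target range
        support = (lo + hi) / 2
        rules = mine(support)
    return support, rules

# Hypothetical candidate rules as (name, support) pairs; a real miner
# would generate these from the ratings data.
candidates = [("r1", 0.40), ("r2", 0.30), ("r3", 0.12),
              ("r4", 0.08), ("r5", 0.05), ("r6", 0.02)]

def mine(support):
    return [r for r in candidates if r[1] >= support]

support, rules = adapt_min_support(mine, min_rules=3, max_rules=4)
print(support, [name for name, _ in rules])
```

The point of the design is that the caller states how many rules are wanted rather than guessing a support value, which would otherwise have to be tuned separately for every user or article.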

Among several competing techniques, it scores better in terms of both quality of results and efficiency of rule generation, making it a convenient approach for online collaborative recommendation. Among those techniques, Shardanand and Maes (1995) and Resnick et al (1994) propose variants of linear (Spearman) correlation for the same problem, but this captures only linear relationships and thus misses apparent but non-linear associations. Breese et al have used Bayesian Networks, but the problem with this approach is that a prior conditional probability has to be calculated for each rule, which is an expensive operation; moreover, the quality of the induced rules cannot be measured. Billsus and Pazzani have used Neural Networks on top of feature reduction schemes including Singular Value Decomposition (SVD) and Information Gain: the space is reduced to a lower dimensionality and neural networks are then used to create the recommendation model.

For user associations, all rules which are in the rule size range and have a minimum support in the data are fired. For article associations, besides the minimum support, a score is also associated with each article, and for a rule with a particular article as its head to be fired, this score should be above a minimum threshold value. The score of an article is the sum of the scores of all the rules fired for it, where the score of a rule is the product of its support and its confidence.
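The article scoring described above can be sketched in a few lines. The rule representation (a dict with `body`, `head`, `support` and `confidence`), the movie titles, the numbers and the decision to skip articles the user already likes are all assumptions made for illustration rather than details taken from the paper.

```python
from collections import defaultdict

def article_scores(rules, liked):
    """Sum support * confidence over the rules that fire for this user.

    A rule fires when every article in its body is in the user's liked set.
    """
    scores = defaultdict(float)
    for rule in rules:
        if rule["body"] <= liked:   # set inclusion: body is a subset of liked
            scores[rule["head"]] += rule["support"] * rule["confidence"]
    return dict(scores)

def recommend(rules, liked, min_score):
    """Return unseen articles whose score is above the threshold."""
    scores = article_scores(rules, liked)
    return sorted(a for a, s in scores.items()
                  if s > min_score and a not in liked)

# Hypothetical mined rules and user profile.
rules = [
    {"body": {"Alien"}, "head": "Aliens", "support": 0.2, "confidence": 0.5},
    {"body": {"Alien", "Blade Runner"}, "head": "Aliens",
     "support": 0.1, "confidence": 0.8},
    {"body": {"Blade Runner"}, "head": "Brazil", "support": 0.05, "confidence": 0.6},
    {"body": {"Heat"}, "head": "Ronin", "support": 0.3, "confidence": 0.9},
]
liked = {"Alien", "Blade Runner"}

print(article_scores(rules, liked))
print(recommend(rules, liked, min_score=0.1))
```

Here two rules fire for the same head and their scores add up, while the rule whose body the user does not match contributes nothing, however strong it is.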

After empirical tests on a commercial online movie recommendation data set, the authors come to the following conclusions:

  1. The method is slightly better than the neural network based approaches proposed by Billsus and Pazzani.
  2. For collaborative recommendations, if a user’s target minimum support is too low, it takes a long time to induce the rules and the quality of the induced rules also deteriorates.
  3. For such cases, use article associations when the minimum support for user associations falls below a threshold, and user associations otherwise. This heuristic also makes it possible to handle new users and new articles, for whom minimum support is always low.
  4. For the majority of users, only a few calls to the main association rule algorithm ASARM2 are needed, compared with the multitude of calls required by the conventional Apriori algorithm.


“Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever.”
– Oliver G. Selfridge, in AI’s Greatest Trends and Controversies

If an expert system–brilliantly designed, engineered and implemented–cannot learn not to repeat its mistakes, it is not as intelligent as a worm or a sea anemone or a kitten.
-Oliver G. Selfridge, from The Gardens of Learning.

Both quotes are taken from the AAAI page on machine learning. Quite inspiring, aren’t they? This gets me motivated to begin my machine learning assignment on ID3 decision tree learning.
so adios for now!