“Not just GenAI” Leadership Series — Part II
This is part of a series of bite-sized quick reads meant to equip leaders with the essential context around GenAI and related areas, enabling you to question proposals and make more informed decisions about such investments.
These posts take a balanced approach, covering not just the strengths of these technologies but also the areas to watch out for. The topics are intentionally presented in a simple, discussion-like manner rather than in a dry, abstract third-party tone…Let’s dive in.
“How to find enterprise use cases that can leverage GenAI?”
The above question is akin to asking, “What problems can I solve with a screwdriver?” rather than, “What is the prioritized list of problems that need to be solved, regardless of whether the solution uses a screwdriver, a hammer, or something else?”
Why look for use cases that can leverage GenAI? Why not prioritize your business problems first, and then, if some of those problems can be solved by GenAI, so be it?
This is not just theoretical semantics. It is a fundamentally incorrect way of approaching GenAI, seen in an alarmingly large number of organizations, and it can lead to misplaced investments.
Yes, GenAI is powerful, but it is not a panacea. It’s just one part of a bouquet of Digital Business components that can work together to elevate your business, provided the time-tested business principles are adhered to.
“If a business problem is already defined and prioritized, how does one check if GenAI is an appropriate solution for that?”
Now that’s a good question. You wouldn’t want to start with GenAI as the solution option. First of all, check if regular business rules can solve the problem. Such rules are deterministic. Meaning, you can predict their behaviour. You know what to expect, because you (and not the machine) defined those business rules.
So, if business rules can solve the problem at hand, look no further.
Having said that, there may be problems where such deterministic rules may not work. This could be because you do not know what rules will work. Or, it could be because the underlying business and data conditions change so rapidly that the rules defined today could get obsolete very quickly.
In such situations, the next best option is to look at applying Machine Learning (ML) techniques, either Discriminative or Generative. A gist of the two approaches is given below; more details are presented in Part I of this series.
Discriminative AI/ML is generally a better option, if:
- The data is structured, meaning, it is available in rows and columns (like an Excel sheet)
- There is not much ‘free text’ in the data, such as natural-language sentences.
- You have a fairly large number of rows, the larger the better.
- The problem is a supervised one (details below)
Note: If there is a non-trivial amount of free text in the data, you’d first need to convert it into a structured form, using a GenAI technique.
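To make the ‘free text to structured’ idea concrete, here is a minimal sketch using only the Python standard library. The tickets, words, and counts are all hypothetical; a real pipeline would more likely use an LLM or embedding model for this step, but the goal is the same: turn text into fixed-width numeric rows.

```python
from collections import Counter

# Hypothetical support tickets: free text that a Discriminative model
# cannot consume directly.
tickets = [
    "card declined at POS terminal",
    "card lost please block card",
    "statement not received this month",
]

# Build a vocabulary, then turn each ticket into a row of word counts.
vocab = sorted({word for t in tickets for word in t.lower().split()})
rows = [[Counter(t.lower().split())[w] for w in vocab] for t in tickets]

# Each ticket is now a fixed-width numeric row ('columns'), ready for
# a classical Discriminative model.
print(vocab)
print(rows[1])  # the word 'card' appears twice in the second ticket
```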
Generative AI/ML is generally a better option, if:
- The data is unstructured, in the form of text, images etc.
- The data is structured, but you have only a small number of rows
- It’s an unsupervised problem (details below)
- When the number of rows is small, Discriminative AI/ML struggles to give good results; GenAI is better in this case.
- When the number of rows is large, Discriminative is normally better than Generative.
- The above points are general guidelines, not absolute rules. For example, there are instances where you would prefer to use GenAI for supervised learning.
Supervised, unsupervised and semi-supervised learning
In the above section, we saw references to these terms. Here, we’ll take a quick look at what they mean.
Supervised learning

You are the Head of the GRC (Governance, Risk and Compliance) team at a leading bank. Your team has collected historical data about credit card defaults. It’s a good data set, containing information about not just defaulters but also non-defaulters. Every entry/row has variables like employment status, age, gender, credit card limit etc., along with the corresponding value of ‘defaulter’ or ‘non-defaulter’. This last value is called the target variable. This is what you need to predict when a new applicant comes in.
Your Data Science team applies algorithms to build models which try to correlate the other variables with the target variable. In other words, this target variable belonging to the historical data acts as a supervising element in the Machine learning process. Hence the term, supervised learning.
Here’s another (very important) way to look at it. The Machine Learning algorithm’s job is to find out the probability of the person being a defaulter, given the other variables.
This is denoted as P(Defaulter|x), where ‘x’ is the full set of variable values for that row.
This is called Conditional Probability, and it is the fundamental identifier of a Discriminative AI/ML approach.
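To see what conditional probability means in practice, here is a toy sketch (all numbers made up) that estimates P(Defaulter | employment status) by simple counting. A real model conditions on the full set of variables, but the idea is the same.

```python
# Toy (made-up) historical rows: (employment_status, defaulted?)
history = [
    ("employed", False), ("employed", False), ("employed", True),
    ("unemployed", True), ("unemployed", True), ("unemployed", False),
    ("employed", False), ("unemployed", True),
]

def p_default_given(status):
    """Estimate P(Defaulter | x) by counting, where x is the
    employment status: the fraction of rows with that status
    that ended up as defaulters."""
    outcomes = [d for s, d in history if s == status]
    return sum(outcomes) / len(outcomes)

print(p_default_given("employed"))    # P(Defaulter | employed)   -> 0.25
print(p_default_given("unemployed"))  # P(Defaulter | unemployed) -> 0.75
```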
Unsupervised learning

In these types of use cases, there is no target variable; hence, unsupervised. The job of the algorithms in this group is to find patterns in the data. Clustering is a good example of unsupervised learning. Again, note that clustering is not done based on any target variable. As with supervised algorithms, there are various techniques/algorithms for clustering, but the common job of all of them is to look at the data and break it into clusters.
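As an illustration, here is a deliberately tiny clustering sketch in plain Python: a two-centre version of the classic k-means idea, applied to hypothetical credit card limits. Notice that no target variable appears anywhere.

```python
def two_means(values, iters=10):
    """A minimal k-means sketch for k=2 clusters of numbers.
    Start the two centres at the extremes, then repeatedly
    (1) assign each value to its nearest centre and
    (2) move each centre to the mean of its assigned values."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return lo, hi

# Hypothetical credit card limits: two natural groups emerge.
limits = [900, 1000, 1100, 1050, 9500, 10000, 10200, 9800]
print(two_means(limits))  # centres settle at 1012.5 and 9875.0
```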
Clustering is just one example of unsupervised algorithms. Another good example is this: you want the algorithm to look at some images and create more like them, within the bounds of the current set of images. For example, if the current set of images are human faces, then the ears in the newly generated images should be within the bounds of regular human ears; the algorithm should not put elephant-sized ears on a human face. This is where a good GenAI algorithm shines: it studies the probability distribution of each variable (e.g. ears) in the current data and uses those bounds to create new data. Discriminative AI/ML does not have this capability.
So, an unsupervised algorithm’s priority is not Conditional Probability; note that there is no target variable to condition on.
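Here is a toy sketch of the ‘stay within the learned bounds’ idea, with made-up numbers: fit a simple Gaussian to one variable of the existing data and sample new values from it. Real generative models learn far richer distributions, but the principle is the same.

```python
import random
import statistics

# Hypothetical 'ear length in cm' measurements from existing face images.
ear_lengths = [5.8, 6.1, 6.4, 5.9, 6.2, 6.0, 6.3, 5.7]

# 'Study the probability distribution' of the variable: here, just its
# mean and standard deviation (a very crude stand-in for what real
# generative models learn).
mu = statistics.mean(ear_lengths)
sigma = statistics.stdev(ear_lengths)

# Generate new values by sampling from the learned distribution.
rng = random.Random(42)
new_ears = [rng.gauss(mu, sigma) for _ in range(5)]
print(new_ears)  # all samples land near 6 cm, none elephant-sized
```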
Semi-supervised learning

This is like a combination of the above two: a small set of labelled data (with a target variable) is provided for supervised learning, along with a larger data set without labels (no target variable), from which the algorithm needs to learn as in the unsupervised approach.
GenAI algorithms normally fit well into the unsupervised/semi-supervised realm.
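One common semi-supervised technique is self-training: train on the small labelled set, pseudo-label only the confident unlabelled points, and retrain. Here is a toy sketch with made-up numbers and a deliberately trivial ‘nearest class mean’ model.

```python
import statistics

# Small labelled set: (credit utilisation %, defaulter?)
labelled = [(10, False), (15, False), (85, True), (90, True)]
# Larger unlabelled set: the semi-supervised part.
unlabelled = [12, 18, 22, 78, 82, 88, 50]

def class_means(rows):
    """Mean utilisation per class: (non-defaulter mean, defaulter mean)."""
    return (statistics.mean(x for x, y in rows if not y),
            statistics.mean(x for x, y in rows if y))

# 1) Train the trivial model on the labelled data alone.
m0, m1 = class_means(labelled)

# 2) Pseudo-label only the *confident* unlabelled points, i.e. those
#    far from the decision boundary between the two class means.
boundary = (m0 + m1) / 2
confident = [(x, x > boundary) for x in unlabelled
             if abs(x - boundary) > 15]  # 50 is skipped: too close to call

# 3) Retrain on labelled + pseudo-labelled data.
m0, m1 = class_means(labelled + confident)
print(m0, m1)  # class means refined to 15.4 and 84.6
```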
Naïve Bayes is an example of GenAI. It was introduced around 1960, about 60 years ago.
Naïve Bayes, and GenAI in general, is founded on Bayes’ Theorem, stated about 250 years ago.
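To see why Naïve Bayes is generative, here is a minimal single-variable Gaussian Naïve Bayes sketch (toy numbers): it models P(x | class) and P(class) for each class, then uses Bayes’ Theorem to pick the more probable class.

```python
import math
import statistics

# Toy training data: (income in thousands, defaulted?)
data = [(20, True), (25, True), (30, True),
        (60, False), (70, False), (80, False)]

def fit(rows):
    """Per class, learn the prior P(class) and a Gaussian for
    P(x | class). Modelling P(x | class) is exactly what makes
    Naïve Bayes a *generative* method."""
    model = {}
    for cls in (True, False):
        xs = [x for x, y in rows if y == cls]
        model[cls] = (len(xs) / len(rows),      # prior P(class)
                      statistics.mean(xs),       # Gaussian mean
                      statistics.stdev(xs))      # Gaussian std dev
    return model

def gaussian_pdf(x, mu, sigma):
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

def predict(model, x):
    # Bayes' Theorem: P(class | x) is proportional to P(x | class) * P(class)
    scores = {cls: prior * gaussian_pdf(x, mu, sigma)
              for cls, (prior, mu, sigma) in model.items()}
    return max(scores, key=scores.get)

model = fit(data)
print(predict(model, 22))  # low income  -> True (defaulter) in this toy data
print(predict(model, 75))  # high income -> False
```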
So, no, GenAI is not new! It has made an impact recently due to a perfect storm of factors: the Transformer architecture (the ‘T’ in GPT), the availability of an unprecedented amount of data, large GPU farms, and so on.
This post gives an under-the-hood understanding of why GenAI is just one part of the overall AI umbrella.
It is neither the ‘next version’ of AI superseding all before it, nor a ‘one size fits all’ solution.
It’s just one more weapon in your AI arsenal, albeit a solid one if used judiciously, for solving business problems.