Microsoft’s long-term generative AI strategy will put it out in front of Google as the world’s largest ad seller.

Matt White
Mar 25, 2023
Microsoft’s Clippy re-contextualized. Image modified by Author.

Disclaimer: I do not work for Microsoft and my comments here are mine and mine alone. They do not represent the views of my employers or any insights I have gleaned from them.

Late last year I provided a list of my predictions in generative AI to a select list of journalists and media outlets. Among those predictions were eight strategic decisions I expected Microsoft to make, and one prediction, related to Microsoft’s overall strategy, about the next iteration of OpenAI’s large language model, GPT-4. Of the nine total predictions, five have already come to fruition and we are only halfway through March, a much faster pace than I had anticipated.

For OpenAI’s GPT-4 I predicted that the model, which was already trained by August of 2022, would be multi-modal ✅ and would not be a bloated 100 trillion parameter model as some people were promoting (Sam Altman had dispelled this myth in August of 2022, although folks really loved posting that two-circle image all over the Internet). Given how well transformers handle both text and image contextualization, and the relationships between the two media, I suspected that this would be the next step for the line of generative pre-trained transformers (GPT). Of course, GPT-4 is only multi-modal on model inputs (prompts): it can recognize both images and text, but it can only generate text (text-to-text/image-to-text). Still, this is an important step in the right direction for large language model (LLM) utility and for creating a more integrated and valuable experience for users, although it is not the only path forward.

For Microsoft I made the following predictions (fulfilled predictions preceded by a check mark):

  • Microsoft would integrate ChatGPT into Bing Search through the industry trend I earlier referred to as the transition from search to chat. This would end up being Microsoft’s first action as a response to the success of ChatGPT.
  • Microsoft would integrate ChatGPT into their Microsoft Office Suite, initially as a method to enhance productivity and increase their user base. This would bring in additional revenue and be a market differentiator for them. The same action would have the effect of eroding the business models of a large number of startups that have built products on top of the GPT-3 API, such as Jasper and many others.
  • Microsoft would initially build a system around ChatGPT on Azure cloud services and Office 365 that would make API calls while storing all user prompts and model outputs in a database hosted in Azure. I also noted that at a future date OpenAI would release a subsequent version of GPT that would incorporate multimodality and eventually a retrieval model architecture to make calls to tools and content directly (as described in my article on Hyper-personalization).
  • Microsoft would experience an embarrassing moment or potentially end up defending a lawsuit due to harmful, unexpected behavior of ChatGPT. This has occurred with its alternate personality Sydney, but the incident was not as damaging as I had anticipated, so perhaps the worst is yet to come.
  • Microsoft would integrate a GPT model into Microsoft Dynamics 365 for internal company knowledge management and customer relationship management, further ingraining organizations’ dependency on the features that make LLMs so compelling.
  • Microsoft, after developing a conversational AI channel with its customers through Office 365, would embrace hyper-personalization and integrate an LLM-backed personal assistant into Microsoft Windows. This is the most important step in Microsoft’s strategy to corner the market on consumer relationships.
  • Microsoft would acquire OpenAI and convert the organization into its AI research arm and its Azure AI business. Microsoft announced they would invest up to $10 billion in OpenAI, so I cannot say that this prediction has been realized, but I suspect that at some point Sam Altman, Greg Brockman, Ilya Sutskever, Greylock and other investors will be looking for an exit. When that happens, OpenAI’s foundational models will move into the Azure AI platform.

What is next for Microsoft?

Folks are marveling over the news that they will be able to work with Microsoft’s Office 365 productivity tools through a conversational interface and automate much of the work they do today with the benefit of a fine-tuned large language model (ChatGPT). Gone are the days of Lorem Ipsum; templates will now be auto-generated with pre-populated, relevant information (text and images) and laid out in an aesthetically pleasing manner, all guided by text prompts. Content, including images and visualizations, will be easily added to Microsoft Word and PowerPoint. Users will set up Excel spreadsheets with prompts, including auto-generating dummy records when needed or pulling them from a file or the Web. ChatGPT integration into Outlook will be used to auto-compose, auto-respond to, prioritize and summarize emails.

However, this is just the beginning of a new journey for Microsoft toward a much more important objective for the company. Microsoft is moving towards creating hyper-personalized experiences for its users. This involves understanding an individual consumer across various dimensions of preference, ranging from more permanent traits like interests, demographics and life circumstances to more transient attributes like state of mind, mood and location, to name a few. To fulfill this strategy, Microsoft needs to corner the market on conversations with the customer, keep the customer engaged in rewarding and meaningful conversations with its service, and keep them from switching to conventional search and discovery through the incumbent platform, Google, when looking for answers, code, templates and any other form of content.

With Bing Chat, Microsoft is currently storing all queries and responses, along with their grounded versions of prompts (prompts modified by the system to achieve the best results from their large language model) and the safety-filtered inputs and outputs, in graph databases in Azure Cosmos DB. By doing this, Microsoft is able to develop a profile of the user. By analyzing the interactions, it can understand a user’s interests, perform sentiment analysis and ultimately understand its customers at a deeper level.
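
To make the idea of profiling from stored prompts concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about how Microsoft actually implements this: the `UserProfile` class, the `TOPIC_KEYWORDS` table and the sample prompts are all hypothetical, and a production system would presumably use learned classifiers rather than keyword matching.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical topic vocabulary, invented for this illustration only.
TOPIC_KEYWORDS = {
    "travel": {"flight", "hotel", "vacation"},
    "cooking": {"recipe", "bake", "dinner"},
    "programming": {"python", "bug", "code"},
}


@dataclass
class UserProfile:
    """Toy preference model: counts how often each topic shows up in prompts."""

    user_id: str
    topic_counts: Counter = field(default_factory=Counter)

    def record_prompt(self, prompt: str) -> None:
        # Credit a topic once per prompt if any of its keywords appear.
        words = set(prompt.lower().split())
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                self.topic_counts[topic] += 1

    def top_interests(self, n: int = 2) -> list[str]:
        # Most frequently seen topics first.
        return [topic for topic, _ in self.topic_counts.most_common(n)]


profile = UserProfile("user-123")
for prompt in [
    "find me a cheap flight to Lisbon",
    "fix this python bug in my code",
    "best hotel near the beach",
]:
    profile.record_prompt(prompt)

print(profile.top_interests())
```

Even this crude counter shows how a handful of stored prompts starts to rank a user’s interests, which is the raw material a preference model would build on.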

In 2022 Microsoft’s search ad revenues were $11.59 billion, which is about 2.2% of all ad spend in 2022. Compare that with Google, whose ad sales topped $224.47 billion in 2022, representing 30% of all ad market sales. Microsoft recognizes that the ad business is a lucrative market, but it is one they have had difficulty tapping into: Bing saw only about 3.19% of all search queries conducted in 2022, whereas 91.88% of all searches went through Google. Google is undoubtedly Microsoft’s main competitor in many domains, including productivity, search, web-based email, online storage, browsers and mobile.
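
As a rough back-of-the-envelope check on the size of that gap, the quoted figures can be plugged into a few lines of Python. The variable names are illustrative, and the only inputs are the numbers quoted above. Note that the two share figures imply different totals for the overall ad market, which suggests they were measured against different market definitions; that reading is an assumption, not something the figures themselves state.

```python
# Figures quoted in the article (billions of USD, calendar year 2022).
microsoft_search_ad_revenue = 11.59
google_ad_revenue = 224.47
microsoft_share = 0.022  # "about 2.2% of all ad spend"
google_share = 0.30      # "30% of all ad market sales"

# How many times larger Google's ad sales are than Microsoft's search ad revenue.
revenue_ratio = google_ad_revenue / microsoft_search_ad_revenue

# Total market size implied by each revenue/share pair.
implied_market_from_microsoft = microsoft_search_ad_revenue / microsoft_share
implied_market_from_google = google_ad_revenue / google_share

print(f"Google / Microsoft revenue ratio: {revenue_ratio:.1f}x")
print(f"Total market implied by Microsoft's share: ${implied_market_from_microsoft:.0f}B")
print(f"Total market implied by Google's share: ${implied_market_from_google:.0f}B")
```

On these numbers Google’s ad sales are roughly 19 times Microsoft’s search ad revenue, which is the distance Microsoft would have to close.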

Microsoft sees OpenAI and the success of ChatGPT as their savior. They moved quickly to integrate ChatGPT into Bing search and are continuing the integration into Office 365, with other enterprise platform integrations such as Microsoft Teams, SharePoint, Dynamics 365 and Dynamics CRM also underway. But getting back to the end game for Microsoft: what they really want to do is corner customer conversations. Although Cortana was a flop that ultimately went by the wayside, Microsoft would like to integrate a conversational AI into their Windows operating system and their platforms in order to achieve the following:

  • Integrate a personal assistant (PA) into Microsoft Windows that sits in front of applications and is the primary interface for interacting with the user.
  • Better understand customers by analyzing conversations and prompts from all of their customer-facing platforms, including the personal assistant.
  • Analyze any submitted text passed into their large language model framework (not just prompts but documents, spreadsheets, presentations, databases, emails, calendars) that is used for processing user requests.
  • Build profiles (preference models) for every customer based on interactions and information provided by users.
  • Engage customers through the personal assistant interface, ensuring that the PA provides significant value and interacts with each customer in a very personalized manner that appeals to the user in both dialog and utility, achieved through the preference model for each user.
  • Use an ad network to deliver targeted ads through suggestions (recommendations) that are highly personalized and relevant to individual customers through the trusted personal assistant.
  • Keep the customer using the personal assistant as much as possible to cut them off from using Google search.

This approach will give Microsoft the ability to own the customer, much like Google does today. Due to the highly personalized and useful nature of the personal assistant, however, users will adopt it as their primary way of interacting with online platforms (search, discovery, shopping, productivity, email, chat, etc.). Microsoft’s personal assistant platform will ultimately become the most desirable platform due to the high conversion rate of advertisements on the platform, and this will set it on a trajectory to become the world’s largest advertising platform.

There should, however, be consumer concern about privacy and security, and that concern is unprecedented. Because large language models need to process all text supplied with prompts to produce desired outcomes like summarization and auto-responses, everything you do in Office 365 or any Microsoft platform will be captured, stored and processed in Microsoft’s Azure cloud systems. On top of the visibility Microsoft will have into your personal and professional doings, they will also have that level of visibility into what is taking place in enterprises, since they will need unencrypted text to input into the back-end large language model.

Enterprises may want to consider hosting their large language models in-house so that they do not have to leak internal information and knowledge to a third party (although many enterprises are already comfortable with this arrangement). There is a real risk of data leakage, as we witnessed with the OpenAI bug that allowed users to see other users’ prior text prompts, as well as leaked emails and payment information. The risk is especially acute for PII (Personally Identifiable Information) and company secrets, since this information can be entered into a large language model like ChatGPT and re-exposed in subsequent responses when used with few-shot learning and prompt engineering.

The scenario I have outlined in this article is not a certainty; it reflects my predictions about Microsoft’s intentions, so please don’t invest in or unload Microsoft stock based on my writings. ;)



