What do you do with a kid who walks early, speaks early, and starts reading before Pre-K? Of course, you nurture and encourage that child to develop their potential. What do you do with technology that reads and writes before its first birthday? And when it starts creating masterpieces of art and music and software code in mere months, then what? Maybe you, like IBM, teach it to pull reports from other apps, or like Microsoft, teach it to write and improve software code, or like Google, streamline it and put it on devices like smartphones. As the parent of a child prodigy, it’s almost negligent not to help this child realize their full potential. And that’s exactly what seems to be happening with Generative AI technology: we have a gifted and talented child on our hands that the technology sector is determined to see flourish. Every day, it seems, a new skill, use, or adaptation is added to Generative AI’s vast and adaptable repertoire. So, what’s our Gen AI whiz kid up to now?
For starters, multimodal capability is becoming the default skillset for Gen AI. One example is the recent introduction of Sora, an AI model that can produce videos up to one minute long from text prompts (https://openai.com/sora). Still in testing and development, this advancement opens the world of video creation to those who can describe what they want to see but don’t have the video editing skills to create a clip. It also makes it possible for people with visual impairments or physical disabilities to create videos more easily (especially with voice-enabled text prompts). Stability AI’s release of Stable Diffusion 3, a text-to-image model, is another indication that generating rich media from text prompts is not limited to OpenAI’s realm (https://stability.ai/news/stable-diffusion-3).
Google, which has rebranded Bard as Gemini, introduced Gemini Business and Gemini Enterprise plans for Workspace customers who want to integrate Gen AI into their Google collaboration software suites (https://blog.google/products/google-one/google-one-gemini-ai-gmail-docs-sheets/). Google had already integrated its Gen AI tool into Docs, Sheets, Slides, and Meet under the moniker Duet AI and is touting greater functionality with this release. By promising not to train its AI models on an organization’s data, Google hopes to lure organizations to its collaboration platform, which competes with offerings from rivals such as Microsoft.
Custom AI chatbots are also available for harnessing the power of Large Language Models (LLMs) and Gen AI by integrating the technology into new and existing workflows. Companies like Zapier (https://zapier.com/ai/chatbot) are helping organizations create no-code chatbots that interface with private knowledge databases to provide services for customers and employees, generate leads, and develop bespoke content. While some security analysts are wary of enterprise LLMs, particularly of uploading confidential information into them, companies offering these models are promising secure data storage and “clawback” functions to delete data from the LLM’s knowledge base.
While these recent advancements continue to mature, the next natural question is “What’s next for Gen AI?” One potentially underdiscussed use of Gen AI’s software coding skills is updating or replacing legacy software code. Late in 2023, IBM, creator of watsonx, and Microsoft, a major investor in OpenAI, announced that they would be collaborating on Gen AI projects to update the legacy COBOL code used by many government agencies and financial institutions, which can be difficult to maintain because of its age, the shift to newer coding languages, and poor documentation (https://fortune.com/2023/10/09/generative-ai-cobol-code-wall-street-ibm-microsoft/amp/). A job once thought so large as to be nearly impossible is now seen as achievable, and in short order, ultimately providing more stability and security to the financial sector.
Generative AI (Gen AI) has come a long way in a very short time. When ChatGPT was released in November 2022, we had a fascinating new answer to what seemed to be a problem with the internet: searches were returning pages and pages of unvetted links, as well as sponsored links to sites paying for the privilege of appearing at the top of the results page. Less than a year later, management consulting firm McKinsey estimated that Gen AI technology could add up to $4.4 trillion annually to the global economy (https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier#key-insights). We’re just entering year two of the Generative AI growth explosion, and what lies ahead is as promising as it is unknown. Adoption is no longer a question of if; it’s a question of how many ways Gen AI can be incorporated into our organizations and personal lives, because this technology child is about to grow again.