JUST ASK: Disrupting Business Applications
How Large Language Models and Embedded, Proprietary Data will Replace Traditional Business Applications as We Know Them
(*not written with generative AI)
With the record breaking pace of adoption of ChatGPT, most of the world is being entertained by the ChatGPT “parlor tricks” of writing poems, limericks, recipes and even how it can be used to cheat on homework.
Meanwhile, a disruption is bubbling under the surface for business.
Business leaders need to be looking hard into Large Language Models because this new paradigm is making access to vast amounts of information universal and is rapidly leading to the commoditization of domain knowledge.
Once all knowledge workers have access to the same data, the next wave for businesses will come from how they leverage their own proprietary data in conjunction with these language models.
By combining their own data with the capabilities of large language models, businesses will be able to gain new insights and access information that was previously difficult or impossible to uncover.
And because ChatGPT is introducing millions to the concept of interacting directly with data using natural language, it is effectively conditioning the world to a new way of interacting with information.
This will allow for a revolutionary way of working with data, as workers will be able to “just ask” their data any question in a natural, unrestricted way.
This shift in data access and analysis has the potential to disrupt traditional business applications and the way that workers currently access their data.
This will lead to a more efficient and intuitive way of working with data, potentially leading to new discoveries and innovation.
No longer will developers of business applications need to be concerned with how to dashboard, visualize, report or workflow business data, instead users will now be able to “Just Ask”.
Ask Is The New UX
This article will explore how “Asking is the new UI” of business applications and how this will likely effectively eliminate the need for applications in favor of direct access to embedded proprietary data in a hyper-personalized manner.
Embedding Data into ChatGPT
To understand why embedded data is so important to these new experiences, we will first need to understand the relevance of the data in these models.
ChatGPT is a variant of GPT-3, which stands for “Generative Pre-trained Transformer 3.” GPT-3 is a large language model developed by OpenAI that uses deep learning to understand and generate human-like language.
ChatGPT is a specific implementation of GPT-3, technically GPT-3.5. GPT-3.5 is a fine-tuned version of GPT-3 optimized for conversational tasks such as chatbots and Q&A systems, influenced by human reinforcement learning.
The data set used to train GPT is called the Common Crawl dataset, which is a large corpus of text data collected from the internet. The model has been trained on a wide variety of texts, including literature, news articles, scientific papers, and online forums, among others. This allows GPT-3 to have a broad understanding of the language and be able to generate coherent and fluent text in a wide range of contexts and styles. The training dataset is also very large, containing over 570GB of text, providing the model with a huge amount of information to learn from.
It’s worth noting that GPT-3 is only as current as the last time the model was trained. This means that if a model was last trained with data from a specific period, the information it generates will be based on that data and may be out of date.
For example, if a model was last trained with data from 2020 and the current date is 2023, the model will not have any knowledge about events or information that has occurred since 2020.
You’ll note this in all of the GPT documentation from OpenAI and Microsoft.
As a result, it is important to understand that the information provided by GPT-3 and ChatGPT may be out of date, especially when it comes to current events or recent developments.
The remedy is to embed data sets into your LLM interactions to bridge the gap of out of data date within the original corpus.
For example, here are some examples of updating the model data within a chat session by embedding data directly into a ChatGPT prompt:
Copying and pasting data arrays directly into your prompt will make them available to the active chat for analysis.
Part of the magic of the chat format is that any data included in any of the prompts and subsequent completions are then be available later in the chat conversation. This allows users to ask questions of the supplied data sets at any point of the chat conversation.
In the case above, we asked for a summarization of a “million dollar home” based on the data set provided in the initiating prompt.
The next clip shows the same data set used in a new chat conversation, but this time we as a different question.
Keep in mind that since both of these chat examples reference the same data set, both questions could have been “asked” of the data within the same chat conversation.
Let’s explore how that might work in this next video:
This time a single data set of play-by-play from the NFL “Big Data Bowl” competition is used. Contestants of the Big Data Bowl use traditional football data and Next Gen Stats to analyze and rethink trends and player performance.
We thought it would be interesting to “Just Ask” a data set a series of questions using ChatGPT.
In addition to basic queries, users can really leverage the language models by asking interesting strategic questions like the “which defender gave up the most yards” question in this example.
A small tweak to the concept of “queries” should be made in this case and simply call these “inquiries” since this suggests a more human-like transaction.
"Inquiries" suggests a more human-like transaction
In this specific use case, we are trying to emulate how a coaching staff might benefit from faster access to the data.
Practically, say this in-game data is collected in real time during a game. In order to leverage the speed of retrieval, users would need the fastest possible way to gain insights from the data in real time in order for the data to be effective.
When every second counts, users can “just ask” questions directly of the data rather than wasting time to first having to process data through an application.
For example, in this next clip, we supply multiple data sets. Each data set contains shared key values that “infer” relationships between the sets.
The language model has the intelligence to relate the multiple data sets through these keys, offering the ability to broaden our questions to query vast arrays of related data.
The language model also has the intelligence to create expressions, like “LineTotal” and “OrderTotal” within the conversation. Likely because ChatGPT is a fine-tuned on BOTH code (like SQL) and text.
The ability to ask questions that involve joins and relationships, like “Which customers should we notify of a product recall?”, allows the user to gain deeper insights from the data and make more informed decisions. This requires LLMs to evaluate Customer data into Orders, then into Order Lines and then into Products.
Within the same conversation, the model can then author an intelligent Sales Summary along with strategic insights because Large language models like ChatGPT have the ability to understand and generate human-like language. This makes it easy for users to ask questions (or “inquire”) of their data in a natural and conversational way, and to read results of their inquiries.
Soon there will be evolutions of LLMs that can do advanced calculations, and more will allow for integration of calculator modules to provide this functionality.
The language model also plays an important role in formatting and summarizing the responses, making it easy for the users to understand the information in the responses to their “inquiries.
This process will allows as any number of users can ask any number of questions of potentially multiple related data sets with each user receiving exactly the information they need.
Traditionally, it takes a lot of work to design a business application that accommodates the user’s needs and create effective workflows to allow them access to the data. This complexity has given rise to the recent trend in “low code” application platforms intended to decrease the application development timelines.
Even the best efforts of developers using these modern platforms cannot anticipate every user’s exact need. Nor can they adapt in real time to the needs of the users as either data evolves or user’s needs change.
However, with API calls to language models like GPT, developers can have access to the same proprietary data sets in these business applications, users would instead get any questions answered from the data that they can think of. As the users or their questions change, no application will need to be updated.
Thus, providing a truly dynamic user experience with endless possibilities for output that adapts in real time to user needs.
Inquiries introduce a concept of “zero code” applications that would not even need any form of UI. All the applications would need would be an interface to allow users to ask questions and display the responses.
Additionally, embedding data sets can infuse data that was collected in real-time, thus circumventing the potential “stale” nature of LLM data and providing the most up-to-date information for the users.
Arguably, copying and pasting data sets into the public ChatGPT version could be considered a “parlor trick”.
Yet, there is a more formal way to inject data sets into API calls using the Embeddings models from either OpenAI or Microsoft.
In December 2022, OpenAI announced a “New and Improved Embedding Model”which is significantly more capable, cost effective, and simpler to use.
The new model,
text-embedding-ada-002, replaces five separate models for text search, text similarity, and code search, and outperforms their previous model, Davinci, at most tasks, while being priced 99.8% lower.
We will post future articles and videos on how to include data sets as embeddings through API calls and share explorations of existing calculator model approaches.
Embedding proprietary data into large language model interactions allows businesses to make the most efficient use of their data by easily searching and analyzing it in a natural, conversational way.
This could help businesses make more informed decisions, uncover new opportunities, and improve their operations through this revolutionary way to query data.
In addition to the chat functionality, ChatGPT has also provided a new user interface experience that allows users to ask direct questions of their data without the need for an application with it’s own presentation layer.
The creation of the ChatGPT chat interface should be viewed as a template for the new way for users to interact with data.
As millions of new users experience ChatGPT, they are also being conditioned (or trained) on an INPUT/RESPONSE/HISTORY (IRH) user interface. They know to ask questions in an INPUT field on the bottom of a webpage, then see their RESPONSE streamed above the input field, while seeing a HISTORY of all other chats listed along the left side of the window.
This IRH template could be applied to other applications that leverage the ChatGPT API on the backend along with embedded data, thus eliminating any need to train users since they will reference what they learned while using ChatGPT.
“Ask” Is The New UI
Unlike traditional business applications, which require users to navigate through multiple menus and interfaces to find the information they need, LLMs allow users to simply ask their data any question in a natural and unrestricted way.
This can lead to a more efficient and intuitive way of working with data, potentially leading to new discoveries and innovation.
The ability to inquire in a conversational manner with data would also eliminate the need for users to have prior knowledge of the structure of the data or the application itself, allowing for wider accessibility to the data.
This will enable businesses to democratize access to data and insights, enabling more people to make data-driven decisions, which could lead to significant improvements in the way businesses operate.
This revolutionary new way of interacting with data by using inquires, combined with the radical new IRH user interface concept, has the potential to disrupt business applications as we know them, leading to a more efficient, intuitive, and accessible way of working with data.
At iSolutionsai we built a team of machine learning experts, data scientists, model programmers and prompt engineers in order to build custom models as a service for our customers.
We create API-driven custom models and integrations with GPT-3/4 that are platform agnostic. Our portfolio continues to grow, even in a very young tech lifecycle.
If you want to learn more about how GPT-3/4 or custom machine learning models could benefit your business, feel free to reach out to us for a free consultation or even a demo of some of the amazing things these technologies can do.
Our experienced team can help you create:
- Custom Machine Learning Models
- GPT-3/4 Language Models
- Computer Vision Models
- AI Chatbots
- Learning Companions
- Community Companions
- Sales Companions
- Marketing Companions
In Orange County, CA? Come join the Ben’s Bites Meetup to talk all things AI, including this very topic.