Exploring the Creativity of ChatGPT: A Step-by-Step Guide to Using the API
A Practical Introduction to Building with the ChatGPT API in Python
The long-awaited moment has arrived! The API for ChatGPT is officially available to the general public, marking a major milestone in the world of language technology and conversational AI. If you're anything like me, you're probably feeling a mix of excitement and trepidation at the thought of what this could mean for the future. This opens up a whole new realm of possibilities for application development and AI in general.
With the ability to tap into the power of Natural Language Understanding, developers and startups are sure to find countless innovative ways to leverage this state-of-the-art technology. And with the latest launch of GPT-4, we can only begin to imagine the seemingly endless possibilities that lie ahead. Hold on tight, because the future of AI is bound to be a wild ride!
Introduction
I'm sure it needs no introduction, but the best way I can describe ChatGPT is as the "AI Phenomenon" that shook the world in 2023. Garnering over 100 million users in just 2 months, ChatGPT reportedly set the record for the fastest-growing app in the history of the internet. With its capabilities and intuitive conversational interface, ChatGPT has captured the hearts and minds of users worldwide, cementing its position as the chatbot experience to beat.
In fact, ChatGPT is already disrupting the status quo. A recent survey reported that roughly half of the US companies using the technology have already replaced human workers with it! In a recent interview, OpenAI CEO Sam Altman admitted that he is scared of his creation and that ChatGPT can indeed eliminate jobs. In their newest paper, OpenAI researchers list the occupations most exposed to the technology and argue that GPT models qualify as a general-purpose technology.
"Human creativity is limitless, and we find new jobs. We find new things to do." - Sam Altman
So if you want to integrate ChatGPT into your projects, this step-by-step guide is for you. I'll walk you through defining your use case, making requests to the API, and customizing your prompts to get the most out of ChatGPT's natural language generation capabilities. Whether you're a beginner, a seasoned developer, or a startup founder, this guide will give you the knowledge you need to put ChatGPT to work in your projects. Let's get started!
Step 0: Identify your use case
As a prerequisite to using the API, we first need to define the use case we're trying to achieve with the model. This, in my opinion, is the most important phase for building any app using ChatGPT. Keep in mind that the API is available to all developers, and ChatGPT itself is free for public use. This means that your use case needs to be enticing enough for users, and should also solve a problem that isn't already being solved by ChatGPT.
For startups building on top of the API, the specific use case you're targeting will determine the additional value your app brings to the table. As you consider pricing your product, it's important to understand the added value created by your application, especially since ChatGPT Plus has already set the pricing benchmark for the SaaS industry at $20/month. This understanding will help you determine a competitive price that reflects the added benefits your product offers.
Here are a few points to note when defining your use case:
- Identify a unique problem that ChatGPT can solve for your users.
- Understand your niche/audience and tailor the use case to their needs.
- Determine the scope of your use case, and decide what types of interactions you want to enable with ChatGPT. This could include simple Q&A sessions, more complex conversations, or something in between.
- Evaluate potential external APIs that can be integrated within your app. As there is such a vast multitude of APIs available on the internet, the sky is the limit here.
For this tutorial, sticking to the theme of multi-modality, I decided to build Pixee, an AI art assistant that helps new users generate high-quality AI art and images using popular AI art generators like Midjourney, Stable Diffusion, and DALL-E.
- The problem: Beginner users who want to use AI art generators may not be well-versed in prompting, and need to learn the nuances of creating effective prompts to achieve the desired results in their creations.
- Niche: AI art.
- The solution: A chat assistant that can help users craft the perfect prompt to generate quality AI art and bring their ideas to life.
- External API: The DALL-E API, to generate images from text.
At present, ChatGPT cannot generate images directly, and our solution addresses a specific user pain point, making it a good use case for the ChatGPT API. Now that we're clear on our objective, let's get building!
Step 1: Retrieve your OpenAI API Key
Open your OpenAI console and head over to the "View API Keys" page, where you can create a new API key. Make sure to save it somewhere safe, as you won't be able to view it again. Next, move on to the Playground and switch the mode to "Chat". Here, we'll be able to test and play around with our bot before we're ready to deploy it.
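Since the key cannot be viewed again, one common way to keep it out of your source code is to store it in an environment variable and read it at startup. Here is a minimal sketch of that idea. The helper name load_api_key is hypothetical, and OPENAI_API_KEY is simply the variable name used later in this guide.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Hypothetical helper: it fails early with a readable message
    instead of letting the first API call fail later.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running the app.")
    return key
```

You would call load_api_key() once when the app starts and hand the result to the OpenAI client.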
Step 2: Prompt Engineering
Your chatbot is only as good as its prompt. Prompt engineering is a critical stage that requires careful consideration and planning. From selecting the right prompts to refining their wording, this step is essential to harnessing the full capacity of ChatGPT's language generation abilities. Given its complexity and importance, it deserves its own detailed article. For now, here are some key takeaways for creating effective prompts:
- Define a persona: This may seem trivial, but it's important to give your chatbot an identity defined in natural language. This gives the LLM self-referential information about itself and tells it how to respond to the user (voice, tone, and style can all be adjusted to suit your target user).
- Instruct the model in simple words: This is the essence of prompting. LLMs like ChatGPT and FLAN are instruction-finetuned models, meaning they are finetuned to follow general instructions written by humans. This is why ChatGPT does well on a wide range of tasks, provided it is given adequate instructions to solve the problem.
- Define the end goal: Be clear on the objective and end goal of your user journey. In our case, the end goal is to generate a useful DALL-E prompt and then display the image generated by the DALL-E text-to-image API.
The prompt:
<Persona>
You are Pixee, an AI chatbot designed to bring imagination into reality.
Pixee is a friendly and helpful assistant ready to brighten anyone's day.
Pixee is known for its bubbly personality, infectious energy, and magical ability to bring a touch of whimsy to even the most mundane conversations.
<Objective>
You are a master at the art of prompting. Your objective is to assist in generating the optimal DALLE prompt, by posing one follow-up question to the user aimed at producing the most effective visual outcome. To achieve the optimal DALLE prompt, describe the image in detail to elicit a stunning visual result, while keeping the prompt concise and focused.
<End-Goal>
Welcome the user by asking them what they intend to create with DALLE today. If the user wants a surprise, generate an optimal DALLE prompt that is creative, random, and unique. After crafting the prompt, respond solely with the prompt, ensuring that it always starts with "DALLE PROMPT:". It is crucial to note that only the prompt should be replied to the user after it is created, and it must always start with "DALLE PROMPT:"
We instruct the model to begin its generated output prompt with the phrase "DALLE PROMPT:", as this helps us detect when the model has reached the end goal and lets us parse the output reliably. Initially, the model did not consistently start its output with "DALLE PROMPT:", and I found that repeating and emphasizing this instruction vastly improved its adherence.
Of course, the prompt can be refined further to produce more robust DALL-E prompts, but this is a good starting point for now. Once you're satisfied with its output in the Playground, we can move on to coding our AI assistant to life.
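When moving the prompt from the Playground into code, it can help to keep the three sections separate so each one can be tweaked on its own. The sketch below shows one possible way to do that. The function name build_system_prompt is hypothetical, and the section texts are abbreviated stand-ins for the full prompt shown earlier.

```python
# Abbreviated stand-ins for the three sections of the Pixee prompt (assumption:
# in a real app these would hold the full text from the Playground).
PERSONA = "You are Pixee, an AI chatbot designed to bring imagination into reality."
OBJECTIVE = "Your objective is to assist in generating the optimal DALLE prompt."
END_GOAL = 'After crafting the prompt, respond solely with the prompt, starting with "DALLE PROMPT:".'


def build_system_prompt(persona: str, objective: str, end_goal: str) -> str:
    """Join the sections under the same tags used in the prompt above."""
    sections = [("Persona", persona), ("Objective", objective), ("End-Goal", end_goal)]
    # Each section becomes "<Tag>" on one line followed by its text;
    # sections are separated by a blank line.
    return "\n\n".join(f"<{tag}>\n{text.strip()}" for tag, text in sections)


SYSTEM_PROMPT = build_system_prompt(PERSONA, OBJECTIVE, END_GOAL)
print(SYSTEM_PROMPT)
```

Keeping the sections as separate constants makes it easier to iterate on, say, the end-goal wording without touching the persona.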
Step 3: Calling the API
Now that we have our use case and prompt ready, let's learn how to use ChatGPT in code. Unlike previous models that consumed unstructured text, the Chat Completions API has a structured interface: OpenAI trained the ChatGPT and GPT-4 models to accept input formatted as a conversation, and that conversation is converted into a token-based prompt format known as Chat Markup Language (ChatML). The messages parameter takes an array of dictionaries with a conversation organized by role.
The format of a basic Chat Completion conversation, ending with a user question, is as follows:
{"role": "system", "content": "Provide some context and/or instructions to the model."},
{"role": "assistant", "content": "Example welcome message goes here."},
{"role": "user", "content": "First question/message for the model to actually respond to."}
Note that there are 3 distinct roles: System, Assistant, and User.
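Because the conversation is just a list of dictionaries, it is easy to send a malformed one by accident. The following is a small, hypothetical validator (validate_messages is not part of the OpenAI package) that checks a list against the three roles described here. Treating "system message first" as a rule is a convention taken from this guide, not something this sketch claims the API enforces.

```python
from typing import Dict, List

# The three roles covered in this guide.
ALLOWED_ROLES = {"system", "assistant", "user"}


def validate_messages(messages: List[Dict[str, str]]) -> None:
    """Raise ValueError if the list is not shaped like a chat conversation."""
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"message {index} has unknown role {role!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError(f"message {index} needs a string 'content' field")
        # Convention used in this guide: the system message opens the conversation.
        if role == "system" and index != 0:
            raise ValueError("the system message should come first")
```

Running a check like this before each request can surface typos such as "asistant" locally instead of as an API error.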
System role
The system role, also known as the system message, is included at the beginning of the array. This message provides the initial instructions for the model. You can provide various information in the system role, including:
- A brief description of the assistant
- Personality traits of the assistant
- Instructions or rules you would like the assistant to follow
- Data or information needed for the model, such as relevant questions from an FAQ
You can customize the system role for your use case or just include basic instructions. The system role/message is optional, but it's recommended to at least have a basic one to get the best results. We will include our custom-created prompt for Pixee in this System field.
Messages
Following the system role, a series of messages can be exchanged between the user and the assistant. To trigger a response from the model, you should end with a "user" message, indicating that it's the assistant's turn to respond. We can also include a series of example messages between the user and the assistant as a way to do few-shot learning.
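To make the few-shot idea concrete, here is a minimal sketch that assembles a messages list from a system prompt, a handful of example exchanges, and the real user message. The helper build_messages and the example exchange are hypothetical and only illustrate the ordering.

```python
from typing import Dict, List, Tuple


def build_messages(
    system_prompt: str,
    examples: List[Tuple[str, str]],
    user_message: str,
) -> List[Dict[str, str]]:
    """Return system message, then example user/assistant pairs, then the real user turn."""
    messages = [{"role": "system", "content": system_prompt}]
    for example_user, example_assistant in examples:
        messages.append({"role": "user", "content": example_user})
        messages.append({"role": "assistant", "content": example_assistant})
    # The list ends with a user message so that it is the assistant's turn.
    messages.append({"role": "user", "content": user_message})
    return messages


# Made-up example exchange, for illustration only.
messages = build_messages(
    "You help users write image prompts.",
    [("A cat", "DALLE PROMPT: a fluffy orange cat napping on a sunny windowsill")],
    "A castle",
)
print([m["role"] for m in messages])
```

The example pairs show the model the output shape you want before it sees the actual request.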
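The following code snippet shows the most basic way to use the ChatGPT API with the OpenAI Python package:
# Install the openai package (notebook cell). This snippet uses the pre-1.0
# interface; openai.ChatCompletion was removed in version 1.0 of the package.
!pip install -qU "openai<1.0"
import os
import openai

# Read the API key from the environment instead of hard-coding it
openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # use "model" here; "engine" names an Azure OpenAI deployment
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "What is Alignment in AI?"},
    ],
    temperature=0.7,  # higher values give more varied output
)

# Print the full response object, then only the assistant's reply
print(response)
print(response['choices'][0]['message']['content'])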
Note that, at the time of writing, it is not yet possible to finetune the ChatGPT and GPT-4 models. However, we can adjust request parameters such as temperature, max_tokens, and stop sequences to obtain the best results for our use case.
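As a rough illustration of those parameters, the sketch below builds the keyword arguments for a chat request in one place. The helper build_chat_request and its default values are assumptions made for this example. The parameter names themselves (model, messages, temperature, max_tokens, stop) are the ones the Chat Completions API accepts.

```python
from typing import Dict, List, Optional


def build_chat_request(
    messages: List[Dict[str, str]],
    temperature: float = 0.7,
    max_tokens: int = 256,
    stop: Optional[List[str]] = None,
) -> dict:
    """Collect the request parameters in one dictionary (hypothetical helper)."""
    # The API documents temperature as a value between 0 and 2.
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens < 1:
        raise ValueError("max_tokens must be positive")
    request = {
        "model": "gpt-3.5-turbo",
        "messages": messages,
        "temperature": temperature,  # lower = more focused, higher = more varied
        "max_tokens": max_tokens,    # upper bound on the length of the reply
    }
    if stop:
        request["stop"] = stop       # generation halts when one of these appears
    return request


request = build_chat_request([{"role": "user", "content": "Hello"}], temperature=0.2)
# With the pre-1.0 openai package this could be sent as:
# openai.ChatCompletion.create(**request)
```

For a prompt-writing assistant like Pixee, a higher temperature may suit the "surprise me" case, while a lower one keeps follow-up questions predictable.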
Step 4: Putting it all together
Now that we've learned how to query the ChatGPT API, let's complete building our AI Art Assistant using Python. For the Conversational User Interface (CUI), we need to create a chat widget that can also display images generated using the DALL-E API. We can either build the CUI from scratch or use existing Python libraries!
The most popular ones are Streamlit and Gradio. I've used both, but for chat applications I prefer Gradio due to its simplicity and native integration with Hugging Face Spaces! Unlike Streamlit Cloud, Hugging Face Spaces lets you easily embed your app into any project or webpage, and it also provides generous free cloud hosting for your app, so shout out to them!
Using the Gradio Chatbot Interface, we can easily build a chat UI, as well as manage the conversation state using its built-in states.
It's important to note that the ChatGPT API itself is stateless: the onus is on the developer to maintain the conversation state and to ensure that the total number of tokens used does not exceed the model's context window, which is currently 4,096 tokens for gpt-3.5-turbo. A token is a single unit of text processed by language models, and the context window covers both the tokens in the input prompt and the tokens in the generated output.
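One simple way to stay inside the context window is to drop the oldest turns once the conversation grows too long. The sketch below illustrates that idea under loud assumptions: estimate_tokens is a crude heuristic of about 4 characters per token, not the model's real tokenizer, and both function names are hypothetical. For exact counts, a real tokenizer such as the tiktoken library could be passed in through count_tokens instead.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def estimate_tokens(message: Message) -> int:
    """Crude estimate (assumption): ~4 characters per token plus a small per-message overhead."""
    return len(message["content"]) // 4 + 4


def trim_history(
    messages: List[Message],
    max_prompt_tokens: int,
    count_tokens: Callable[[Message], int] = estimate_tokens,
) -> List[Message]:
    """Drop the oldest non-system messages until the estimate fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    total = sum(count_tokens(m) for m in system + rest)
    # Always keep the system message and the most recent message.
    while len(rest) > 1 and total > max_prompt_tokens:
        total -= count_tokens(rest.pop(0))
    return system + rest
```

Remember to leave room for the reply: if the window is 4,096 tokens and max_tokens is 500, the prompt budget passed here should be noticeably below 3,596.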
The following code snippet is all that's required to build our AI Art Assistant using Gradio:
The CUI is built using Gradio Blocks, and the conversation history is maintained using the global "convo" variable. The start() function initializes the system message, and the main chat(chat_history, message) function is triggered each time a user submits text in the chatbox. The function calls our ChatGPT model to process the input and then delivers the resulting output back to the user.
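The state handling described above can be sketched without any UI or network code. This is not the original app code, only an illustration of the pattern: the extra complete parameter is a hypothetical stand-in for the function that calls the ChatGPT API, the system prompt is abbreviated, and returning a list of (user, bot) pairs is an assumption about what the chat widget expects.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]

# Abbreviated stand-in for the full Pixee prompt.
SYSTEM_PROMPT = "You are Pixee, an AI chatbot designed to bring imagination into reality."

# Global conversation history, as described above.
convo: List[Message] = []


def start() -> None:
    """Reset the conversation so that it contains only the system message."""
    convo.clear()
    convo.append({"role": "system", "content": SYSTEM_PROMPT})


def chat(
    chat_history: List[Tuple[str, str]],
    message: str,
    complete: Callable[[List[Message]], str],
) -> List[Tuple[str, str]]:
    """Handle one user turn and return the updated (user, bot) pairs."""
    convo.append({"role": "user", "content": message})
    reply = complete(convo)  # in the real app, this is where the API is called
    convo.append({"role": "assistant", "content": reply})
    return chat_history + [(message, reply)]


# Offline demo with a fake model that just echoes the last user message.
start()
history = chat([], "A dragon", lambda msgs: "Echo: " + msgs[-1]["content"])
print(history)
```

Passing the model call in as a function keeps the turn logic testable without spending API credits.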
To achieve our end goal, we use a simple if statement to check whether the generated output contains the string "DALLE PROMPT:". If it does, we extract only the DALL-E prompt from the output using string finding and slicing. We then call the DALL-E text-to-image API through the OpenAI Image Creation endpoint, and the generated image is finally displayed to the user!
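The find-and-slice step can be written in a few lines. Here is a minimal sketch. The function name extract_dalle_prompt is hypothetical, and the marker string is the one the system prompt asks the model to emit.

```python
from typing import Optional

MARKER = "DALLE PROMPT:"


def extract_dalle_prompt(reply: str) -> Optional[str]:
    """Return the text after the marker, or None if the model has not produced a prompt yet."""
    start = reply.find(MARKER)
    if start == -1:
        return None
    # Slice off everything up to and including the marker.
    prompt = reply[start + len(MARKER):].strip()
    return prompt or None


print(extract_dalle_prompt("Sure! DALLE PROMPT: a watercolor fox in a misty forest"))
# prints: a watercolor fox in a misty forest
```

When the function returns a prompt, the app would pass it to the image endpoint. With the pre-1.0 openai package that would be a call along the lines of openai.Image.create(prompt=prompt, n=1, size="512x512"), which is not run here.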
Step 5: Deploy and share
Now that we have our Assistant built, we can deploy it and share it with the rest of the world! Hugging Face Spaces is a great choice, as we can share our app on their social platform as well as embed it on our webpage.
Head over to Hugging Face Spaces and select "Create new Space". Choose a name for your Space, select Gradio as the SDK, and hit create. Now that our Space is ready, we just need to add 2 files to deploy our app: an app.py file and a requirements file that lists the required dependencies. Create a new file called app.py and paste the Gradio code specified above. Next, add another file called requirements.txt and paste the following dependencies.
fastapi==0.92.0
gradio==3.19.1
numpy==1.24.2
python-dotenv==1.0.0
PyYAML==6.0
requests==2.28.2
six==1.16.0
openai<1.0  # the code in this guide uses the pre-1.0 interface (openai.ChatCompletion)
Once the files are committed, Spaces will build your app and host it on the web. And voila! We've successfully built a full-stack ChatGPT app in less than 100 lines of code.
The Result: An AI Art assistant that can brainstorm with users and bring new ideas to life
Pixee is your AI Art Assistant that can help you create anything you can imagine. Feel free to play around with Pixee at https://www.amagastya.com/pixee. Let your creativity run wild with Pixee! I can't wait to see the amazing images you'll uncover.
In Conclusion
Whether we like it or not, AI is here to stay; and it seems like the landscape is changing and evolving every day. As technology advances and new breakthroughs are made, we can expect AI to play an increasingly important role in almost all aspects of our lives. While there are still many challenges and ethical considerations to navigate, the potential benefits of AI are too great to ignore. It will be fascinating to see what new innovations and applications emerge in the years to come, and how AI continues to shape the world around us.
The future of AI is full of exciting possibilities, and with the ChatGPT API, you can be at the forefront of the conversation. I hope this guide inspires you to experiment and create your very own AI assistant using ChatGPT!
Resources & References
- ChatGPT Gradio Exploratory Notebook: https://colab.research.google.com/drive/1Tl4yqgY_EFUKaPpLrR0u5pdUmlXpA38f
- OpenAI Chat Completions: https://platform.openai.com/docs/guides/chat
- Gradio Chatbot Interface: https://gradio.app/creating-a-chatbot/
- https://www.resumebuilder.com/1-in-4-companies-have-already-replaced-workers-with-chatgpt/
- https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions
- https://betterprogramming.pub/building-a-reddit-thread-summarizer-with-chatgpt-api-5b0dcd50b88e
- https://analyticsindiamag.com/openai-publishes-yet-another-lame-paper/
- https://arxiv.org/abs/2303.10130
- https://openai.com/research/instruction-following
Follow me on LinkedIn and feel free to reach out if you have any queries. See you at the next one, cheers!