Exploring the Creativity of ChatGPT: A Step-by-Step Guide to Using the API

Amogh Agastya
Published in Level Up Coding
12 min read · Mar 28, 2023

A Practical Introduction to Building with the ChatGPT API in Python

Image by Author, generated using DALL-E.

The long-awaited moment has arrived! The API for ChatGPT is officially available to the general public 🙌, marking a major milestone in the world of language technology and Conversational AI. If you’re anything like me, you’re probably feeling a mix of excitement and trepidation at the thought of what this could mean for the future. This will change everything and unlock a whole new realm of possibilities for application development and AI in general.

With the ability to tap into the power of Natural Language Understanding, developers and startups are sure to find countless innovative ways to leverage this state-of-the-art technology. And with the latest launch of GPT-4, we can only begin to imagine the seemingly endless possibilities that lie ahead. Hold on tight, because the future of AI is bound to be a wild ride!

Introduction

I’m sure it needs no introduction, but the best way I can describe ChatGPT is as the “AI Phenomenon” that shook the world in 2023. Garnering over 100 million users in just 2 months, ChatGPT sets the record for the fastest-growing app in the history of the internet. With its unparalleled capabilities and intuitive conversational interface, ChatGPT has captured the hearts and minds of users worldwide, cementing its position as the ultimate chatbot experience.

Source: App Economy Insights

In fact, ChatGPT is already disrupting the status quo. A recent survey reported that 50% of US companies using the technology have already replaced human workers! 🙀 In his latest interview, OpenAI CEO Sam Altman admitted that he is scared of his creation and that ChatGPT can indeed eliminate jobs. In their newest paper, OpenAI lists jobs that could potentially be replaced and argues that GPT technology qualifies as a General-Purpose Technology.

“Human creativity is limitless, and we find new jobs. We find new things to do” — Sam Altman.

So if you want to integrate ChatGPT into your projects, this step-by-step guide is for you. I’ll walk you through defining your use case, making requests to the API, and customizing your prompts to get the most out of ChatGPT’s natural language generation capabilities. Whether you’re a beginner, a seasoned developer, or a startup founder, this guide will provide you with the knowledge you need to leverage the full potential of ChatGPT and take your projects to the next level. 🚀 Let’s get started!

Step 0: Identify your use case

As a prerequisite to using the API, we first need to define the use case we’re trying to achieve with the model. This, in my opinion, is the most important phase for building any app using ChatGPT. Keep in mind that the API is available to all developers, and ChatGPT itself is free for public use. This means that your use case needs to be enticing enough for users, and should also solve a problem that isn’t already being solved by ChatGPT.

For startups building on top of the API, the specific use case you’re targeting will determine the additional value your app brings to the table. As you consider pricing your product, it’s important to understand the added value created by your application, especially since ChatGPT Plus has already set the pricing benchmark for the SaaS industry at $20/month. This understanding will help you determine a competitive price that reflects the added benefits your product offers.

Here are a few points to note when defining your use case —

  1. Identify a unique problem that ChatGPT can solve for your users.
  2. Understand your niche/audience and tailor the use case to their needs.
  3. Determine the scope of your use case, and decide what types of interactions you want to enable with ChatGPT (This could include simple Q&A sessions, more complex conversations, or something in between)
  4. Evaluate potential external APIs that can be integrated within your app. As there is such a vast multitude of APIs available on the internet, the sky is the limit here.

For this tutorial, sticking to the theme of multi-modality, I decided to build an AI art assistant called Pixee, which helps new users generate high-quality AI art and images using popular AI art generators like Midjourney, Stable Diffusion, and DALL-E.

https://www.amagastya.com/pixee

The problem: Beginner users who want to use AI art generators may not be well-versed in prompting, and need to learn the nuances of creating effective prompts to achieve the desired results in their creations.
Niche: AI Art.
The solution: A Chat Assistant that can help users craft the perfect prompt to generate quality AI art and bring their ideas to life.
External API: DALL-E API to generate images from text.

At present, ChatGPT does not have the ability to directly generate images, and our solution also addresses a specific user pain point, making it a good use case for leveraging the ChatGPT API. Now that we’re clear on our objective, let’s get building! ⚒️

Step 1: Retrieve your OpenAI API Key 🔑

Open your OpenAI console and head on over to the “View API Keys” page, where you can create a new API key. Make sure to save it somewhere safe, as you won’t be able to view it again. Next, move on to the Playground and switch the mode to ‘chat’. Here, we’ll be able to test and play around with our bot before we’re ready to deploy it.
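Since the key cannot be viewed again, it helps to keep it out of the source code and read it from the environment. Here is a minimal sketch of that idea; the helper name load_api_key is my own for illustration, and OPENAI_API_KEY is the variable name used by the API snippet later in this guide.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before calling the API.")
    return key


# Demo with a stand-in mapping, so no real key ever appears in the code.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In a real app you would call load_api_key() with no arguments, so the key comes from the actual environment, and fail early if it is missing.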

OpenAI Chat Playground

Step 2: Prompt Engineering ⚙️

Your chatbot is only as good as its prompt. Prompt engineering is a critical stage in the process that requires careful consideration and planning. From selecting the right prompts to refining their wording, this step is essential in harnessing the full capacity of ChatGPT’s language generation abilities. Given its complexity and importance, it deserves its own detailed article. For now, here are some key takeaways for creating effective prompts:

  1. Define a persona: This may seem trivial, but it’s important to create a natural language-defined identity for your chatbot, as this gives the LLM self-referential information to understand itself and informs the model on how to respond to the user (voice, tone, style, etc. can all be modified to suit your target user).
  2. Instruct the model in simple words: This is the essence of prompting. LLMs like ChatGPT and FLAN are what’s called instruction-finetuned models, meaning they are finetuned to follow general instructions provided by their human trainers. This is why ChatGPT excels at seemingly any arbitrary task, provided it has adequate instructions to solve the problem.
  3. Define the end goal: Be clear on what the objective and end goal is for your user journey. In our case, the end goal is to generate a useful DALLE prompt and also to display the generated image using the DALLE text-to-image API.

An illustration of how FLAN works: The model is fine-tuned on disparate sets of instructions and generalizes to unseen instructions. As more types of tasks are added to the fine-tuning data, model performance improves. Source: Google AI

The prompt:

<Persona>

You are Pixee — an AI chatbot designed to bring imagination into reality.
Pixee is a friendly and helpful assistant ready to brighten anyone’s day.
Pixee is known for its bubbly personality, infectious energy, and magical ability to bring a touch of whimsy to even the most mundane conversations.

<Objective>
You are a master at the art of prompting. Your objective is to assist in generating the optimal DALLE prompt, by posing one follow-up question to the user aimed at producing the most effective visual outcome. To achieve the optimal DALLE prompt, describe the image in detail to elicit a stunning visual result, while keeping the prompt concise and focused.

<End-Goal>
Welcome the user by asking them what they intend to create with DALLE today. If the user wants a surprise, generate an optimal DALLE prompt that is creative, random, and unique. After crafting the prompt, respond solely with the prompt, ensuring that it always starts with ‘DALLE PROMPT:’. It is crucial to note that only the prompt should be replied to the user after it is created, and it must always start with ‘DALLE PROMPT:’

We instruct the model to begin its generated output prompt with the phrase ‘DALLE PROMPT:’, as this helps us detect when the model has reached the end goal and enables us to parse the output effectively. Initially, the model did not consistently start its output with ‘DALLE PROMPT:’, but I found that emphasizing this instruction vastly improves its adherence to the directive.

Of course, the prompt can be further finetuned to create more robust DALLE prompts, but this is a good starting point for now. Once you’re satisfied with its output in the Playground, we can move on to coding our AI assistant to life.
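Because the prompt is split into tagged sections, it can be handy to keep each section separate in code and join them only when needed, which makes it easier to tweak one part at a time. The following is just a sketch: build_system_prompt is a hypothetical helper of mine, and the section texts are abbreviated from the full prompt above.

```python
def build_system_prompt(sections):
    """Join (tag, text) pairs into one prompt, each section headed by <tag>."""
    return "\n\n".join(f"<{tag}>\n{text.strip()}" for tag, text in sections)


# Section texts are abbreviated here; paste the full wording from the prompt above.
pixee_sections = [
    ("Persona", "You are Pixee, an AI chatbot designed to bring imagination into reality."),
    ("Objective", "Assist in generating the optimal DALLE prompt by posing one follow-up question."),
    ("End-Goal", "Respond solely with the prompt, always starting with 'DALLE PROMPT:'."),
]

system_prompt = build_system_prompt(pixee_sections)
print(system_prompt)
```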

Step 3: Calling the API👨‍💻

Now that we have our use case and prompt ready, let’s learn how to use ChatGPT in code. Unlike previous models that consumed unstructured text, the ChatGPT models are queried through the Chat Completions API, which has a new structured interface. Under the hood, the conversation is converted into a token-based prompt format known as Chat Markup Language (ChatML). OpenAI trained the ChatGPT and GPT-4 models to accept input formatted as a conversation. The messages parameter takes an array of dictionaries with a conversation organized by role.

The format of a basic conversation that ends with a user question is as follows:

messages = [
    {"role": "system", "content": "Provide some context and/or instructions to the model."},
    {"role": "assistant", "content": "Example welcome message goes here."},
    {"role": "user", "content": "First question/message for the model to actually respond to."},
]

Note that there are 3 distinct roles — System, Assistant, and User.

System role

The system role, also known as the system message, is included at the beginning of the array. This message provides the initial instructions for the model. You can provide various information in the system role, including:

  • A brief description of the assistant
  • Personality traits of the assistant
  • Instructions or rules you would like the assistant to follow
  • Data or information needed for the model, such as relevant questions from an FAQ

You can customize the system role for your use case or just include basic instructions. The system role/message is optional, but it’s recommended to at least have a basic one to get the best results. We will include our custom-created prompt for Pixee in this System field.

Messages

Following the system role, a series of messages can be exchanged between the user and the assistant. To trigger a response from the model, you should end with a “user” message, indicating that it’s the assistant’s turn to respond. We can also include a series of example messages between the user and the assistant as a way to do few-shot learning.
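To make the ordering concrete, here is a small sketch that assembles a messages array with a system message, a few example exchanges for few-shot learning, and a final user message. The function build_messages and the sample texts are made up for illustration; only the role/content dictionary format comes from the API.

```python
def build_messages(system_prompt, examples, user_message):
    """Return a messages list: system message, few-shot pairs, then the user turn."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # End with a user message so that it is the assistant's turn to respond.
    messages.append({"role": "user", "content": user_message})
    return messages


# One example exchange, followed by the message we actually want answered.
few_shot = [("A cat in space", "What art style would you like for the cat?")]
for m in build_messages("You are Pixee.", few_shot, "A castle in the clouds"):
    print(m["role"], "->", m["content"])
```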

The following code snippet shows the most basic way to use the ChatGPT API using the OpenAI Python package —

# Install a pre-1.0 version of the openai package, which provides openai.ChatCompletion
!pip install -q "openai<1"
import os
import openai

# Read the API key from the environment instead of hard-coding it
openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # the OpenAI API expects `model`; `engine` is for Azure deployments
    messages=[
        {"role": "system", "content": "Assistant is a large language model trained by OpenAI."},
        {"role": "user", "content": "What is Alignment in AI?"},
    ],
    temperature=0.7,
)
print(response)  # the full response object
print(response['choices'][0]['message']['content'])  # just the assistant's reply

Note that it is not yet possible to finetune the ChatGPT and GPT-4 models. However, we can adjust request parameters such as temperature, max_tokens, and stop sequences to obtain the best results for our use case.
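As a rough sketch of how these parameters fit together, the helper below collects them into one dictionary that can be unpacked into the API call. The name build_request and the default values are my own assumptions; temperature, max_tokens, and stop are the actual parameter names accepted by the Chat Completions API.

```python
def build_request(messages, temperature=0.7, max_tokens=256, stop=None):
    """Collect the request parameters for a chat completion into one dict."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    request = {
        "model": "gpt-3.5-turbo",
        "messages": messages,
        "temperature": temperature,  # higher values give more varied output
        "max_tokens": max_tokens,    # upper limit on the length of the reply
    }
    if stop is not None:
        request["stop"] = stop       # sequences at which generation ends
    return request


request = build_request([{"role": "user", "content": "Hi!"}], temperature=0.2)
print(request)
# With the pre-1.0 openai package this would be sent as:
# response = openai.ChatCompletion.create(**request)
```

A lower temperature suits factual answers, while a creative assistant like Pixee may benefit from a higher one.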

Step 4: Putting it all together 🧩

Now that we’ve learned how to query the ChatGPT API, let’s complete building our AI Art Assistant using Python. For the Conversational User Interface (CUI), we need to create a Chat widget that can also support displaying images generated using the DALLE API. We can either build the CUI from scratch or use existing helpful Python libraries!

The most popular ones are Streamlit and Gradio. I’ve used both, but for chat applications I prefer Gradio due to its simplicity and native integration with Hugging Face Spaces! 🤗 Unlike Streamlit Cloud, Hugging Face Spaces lets you easily embed your app into any project or webpage, and it also provides generous free cloud hosting for your app, so shout out to them!

Source: Comparing Web UI tools for Data Science

Using the Gradio Chatbot Interface, we can easily build a chat UI, as well as manage the conversation state using its built-in states.

☝️ It’s important to note that the ChatGPT API itself is stateless: the onus is on the developer to maintain the conversation state and to ensure that the total number of tokens used does not exceed ChatGPT’s context window, which is currently 4096 tokens. A token is a single unit of text processed by language models, and the context window covers the combined number of tokens in the input prompt and the generated output.
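One simple way to stay within the context window is to drop the oldest messages once the conversation grows too long, while always keeping the system message. The sketch below is only an approximation: trim_history and estimate_tokens are hypothetical helpers, and the "about 4 characters per token" rule is a rough assumption rather than a real tokenizer.

```python
def estimate_tokens(message):
    """Rough estimate: about 4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(message["content"]) // 4)


def trim_history(messages, budget, count=estimate_tokens):
    """Drop the oldest non-system messages until the estimated total fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    total = sum(count(m) for m in system + rest)
    while rest and total > budget:
        total -= count(rest.pop(0))  # remove the oldest message first
    return system + rest


history = [
    {"role": "system", "content": "You are Pixee."},
    {"role": "user", "content": "first idea " * 50},
    {"role": "assistant", "content": "first answer " * 50},
    {"role": "user", "content": "A castle in the clouds"},
]
# Leave room for the reply: context window minus the tokens reserved for output.
trimmed = trim_history(history, budget=4096 - 3900)
print([m["role"] for m in trimmed])
```

For accurate counts you would swap estimate_tokens for a function backed by a real tokenizer.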

Illustration of ChatGPT’s context window. Source: Building a Reddit Thread Summarizer With ChatGPT API

The following code snippet is all that’s required to build our AI Art Assistant using Gradio —

The CUI is built using Gradio Blocks, and the conversation history is maintained using the global ‘convo’ variable. The start() function initializes the system message, and the main chat(chat_history, message) function is triggered each time a user submits text in the chatbox. The function calls our ChatGPT model to process the input and then delivers the resulting output back to the user.
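The state handling described here can be sketched without any UI code. This is a reconstruction under assumptions, not the app's exact source: it reuses the names convo, start, and chat from the description, and the model call is passed in as a hypothetical complete function so the logic can be run without an API key.

```python
convo = []  # global conversation state, in the API's message format


def start(system_prompt):
    """Reset the conversation so that it holds only the system message."""
    convo.clear()
    convo.append({"role": "system", "content": system_prompt})


def chat(chat_history, message, complete):
    """Handle one user turn and return the updated history for the chat widget.

    `complete` is a hypothetical callable that takes the messages list and
    returns the assistant's reply text, for example a thin wrapper around
    the Chat Completions call shown earlier.
    """
    convo.append({"role": "user", "content": message})
    reply = complete(convo)
    convo.append({"role": "assistant", "content": reply})
    return chat_history + [(message, reply)]


def echo_model(messages):
    """Stand-in for the real model: repeats the last user message."""
    return f"You said: {messages[-1]['content']}"


start("You are Pixee.")
print(chat([], "A castle in the clouds", echo_model))
```

Keeping convo separate from chat_history means the model sees the full role-tagged conversation while the chat widget only receives (user, assistant) pairs to display.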

To achieve our end goal, we use a simple if statement to check whether the generated output contains the string “DALLE PROMPT:”. If this string is present, we can extract only the DALLE prompt from the output using string finding and slicing techniques. Once the prompt is extracted, we call the DALLE text-to-image API using the OpenAI Image Creation endpoint, and the resulting generated image is finally displayed to the user!
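The marker check and slicing step might look roughly like this. The function name extract_dalle_prompt is made up for illustration, and the image call in the trailing comment assumes the pre-1.0 openai package.

```python
MARKER = "DALLE PROMPT:"


def extract_dalle_prompt(reply):
    """Return the text after the marker, or None if the marker is absent."""
    index = reply.find(MARKER)
    if index == -1:
        return None  # the model is still asking follow-up questions
    return reply[index + len(MARKER):].strip()


prompt = extract_dalle_prompt("DALLE PROMPT: A whimsical castle floating in the clouds")
print(prompt)
# With the pre-1.0 openai package, the image could then be requested with:
# image = openai.Image.create(prompt=prompt, n=1, size="512x512")
# url = image["data"][0]["url"]
```

Returning None when the marker is missing lets the caller simply show the model's reply as a normal chat message.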

Step 5: Deploy and share 🌐

Now that we have our Assistant built, we can deploy it and share it with the rest of the world! Hugging Face Spaces is a great choice, as we can share our app on their social platform as well as embed it on our webpage.

Head over to Hugging Face Spaces and select ‘Create new space.’ Choose a name for your space, select Gradio as the SDK, and hit create space. Now that our space is ready, we just need to add 2 files to deploy our app: an app.py file and a requirements file that lists the required dependencies. Create a new file called app.py and paste the Gradio code specified above. Next, add another file called requirements.txt and paste the following dependencies.

fastapi==0.92.0
gradio==3.19.1
numpy==1.24.2
python-dotenv==1.0.0
PyYAML==6.0
requests==2.28.2
six==1.16.0
openai<1.0

Once the files are committed, Spaces will build your app and host it on the web. And voila! We’ve successfully built a Fullstack ChatGPT app in less than 100 lines of code 💯

The Result: An AI Art assistant that can brainstorm with users and bring new ideas to life 💡

Pixee comes up with creative ideas all by itself and also displays the generated image for the user
Pixee asks a relevant follow-up question to enhance the user’s idea
Users can also offer follow-up instructions based on the image and our assistant will remember the context and seamlessly generate new images!

Pixee is your AI Art Assistant that can help you create anything you can imagine. Feel free to play around with Pixee at https://www.amagastya.com/pixee. Let your creativity run wild with Pixee! I can’t wait to see the amazing images you’ll uncover.

In Conclusion

Whether we like it or not, AI is here to stay, and it seems like the landscape is changing and evolving every day. As technology advances and new breakthroughs are made, we can expect AI to play an increasingly important role in almost all aspects of our lives. While there are still many challenges and ethical considerations to navigate, the potential benefits of AI are too great to ignore. It will be fascinating to see what new innovations and applications emerge in the years to come, and how AI continues to shape the world around us.

The future of AI is full of exciting possibilities, and with the ChatGPT API, you can be at the forefront of the conversation 💬 I hope this guide inspires you to experiment and create your very own AI assistant using ChatGPT!

Resources & References

Follow me on LinkedIn and feel free to reach out if you have any queries. See you at the next one, cheers!

Conversational AI Evangelist 💬 Helping Businesses Optimize Revenue using Conversational Intelligence 🧠 amagastya.com