Craft Your Own Kitchen Companion: Leveraging OpenAI Assistant, Image Generation, and Text-to-Speech for a Revolutionary Cooking Experience

TL;DR

OpenAI unveiled its Assistants API at its first developer conference, OpenAI DevDay. This marked a significant addition to the suite of tools and models available for building conversational AI, a domain in which I’ve been actively engaged.

Although I have built customized assistants with other frameworks, tools, and models in the past (see my previous posts Food QnA Chatbot : Help Answer Food Related Questions from Your Own Cookbook and Build On-Device QnA with LangChain and Llama2), the new capabilities of the Assistants API intrigued me enough to integrate it into a new project. This sample project is a refreshed demonstration of a simple Cooking Assistant that combines the Assistants API with Text-to-Speech and Image Generation, showing a holistic approach to building conversational AI systems with only a few lines of code.

Get inspired by the latest from OpenAI DevDay with my sample project. Ready to unleash your creativity? Start building your own generative AI-powered experience today, join the forefront of AI innovation, and let us make AI for everyone!

The Crafted Product Experience

Direct Communication:
- Purposefully address the user (in this case, “Han”) directly in all correspondence, ensuring each interaction is clear and personalized.
Resource Utilization:
- Utilize information solely from the user’s provided cookbooks when responding to their inquiries, ensuring responses are relevant and accurate.
Requests for Visual Representation:
- Initial Acknowledgment: When the user requests a visual representation of a recipe, promptly reply with: “Absolutely! A delightful visual preview of the recipe is coming up shortly. Please hold on!”
- Action Step: Subsequently, craft a detailed and imaginative textual prompt to aid in visualizing the specified recipe.
Recipe Responses:
- Introduction: Initiate responses to recipe queries with: “Here’s a recipe I discovered!”
- Summary Style: Summarize the recipe succinctly using bullet points for easy comprehension.
- Summary Length: Aim to keep the summary under 150 words to maintain brevity and focus.
Handling Limitations and Concluding Respectfully:
- Addressing Unavailability: If the user’s questions cannot be resolved with the cookbooks at hand, or if an appropriate recipe is not available, inform the user politely about these constraints.
- Concluding Interactions: End conversations with respect and professionalism, ensuring a positive and courteous closure.
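The guidelines above can be handed to the Assistants API as a single instruction string. The following is a minimal sketch, not the sample project's actual code: `build_instructions`, `create_cooking_assistant`, and the assistant name are hypothetical, the rule wording is condensed from the list above, and `client` is assumed to be an `openai.OpenAI()` instance. Only `client.beta.assistants.create` with `name`, `instructions`, and `model` is used here; how cookbook PDFs are attached for retrieval differs between SDK versions, so that part is left out.

```python
from typing import Any


def build_instructions(user_name: str) -> str:
    """Assemble the behavior rules into one instruction string (one rule per line)."""
    rules = [
        f"Address the user ({user_name}) directly in every reply.",
        "Answer only from the cookbooks the user has provided.",
        "When asked for a visual of a recipe, first reply: "
        "'Absolutely! A delightful visual preview of the recipe is coming up "
        "shortly. Please hold on!' Then write a detailed, imaginative text "
        "prompt describing the dish.",
        "Start recipe answers with: 'Here's a recipe I discovered!' and "
        "summarize in bullet points, under 150 words.",
        "If the cookbooks cannot answer a question, say so politely.",
        "Close conversations courteously.",
    ]
    return "\n".join(f"- {rule}" for rule in rules)


def create_cooking_assistant(client: Any, user_name: str, model: str) -> Any:
    """Register an assistant; `client` is assumed to be an `openai.OpenAI()` instance."""
    return client.beta.assistants.create(
        name="Cooking Assistant",  # hypothetical display name
        instructions=build_instructions(user_name),
        model=model,  # caller picks a model that supports the Assistants API
    )
```

Keeping the rules in a plain list makes it easy to adjust one behavior, such as the 150-word limit, without touching the rest of the prompt.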

Set Up the Development Environment

The provided sample project on GitHub acts as a prototype to illustrate the product experience described previously. Follow these steps to set up your local development environment:

Clone the Sample Project: Use the following command to clone the project from GitHub:
```
git clone https://github.com/bearbearyu1223/assistant_api_playground
```
Create a Virtual Environment: Navigate to the cloned project directory and create a Python virtual environment named cooking_assistant by executing:
```
python3 -m venv cooking_assistant
```
Activate the Virtual Environment: Activate the newly created virtual environment with:
```
source cooking_assistant/bin/activate
```
Install Required Libraries: Inside the activated virtual environment, install the required libraries using the command:
```
pip3 install -r requirements.txt
```
Set Up Environment Variables: Create a .env file in the project root directory. Add your OpenAI API key to the file as follows:
```
OPENAI_API_KEY=<Your API KEY>
```
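At startup, the key in `.env` has to end up in the process environment so the OpenAI client can find it. One common option is `load_dotenv` from the python-dotenv package; whether the sample project uses it is not stated here. The standard-library sketch below handles simple `KEY=VALUE` lines only, and `load_env_file` is a hypothetical helper name.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Copy KEY=VALUE lines from `path` into os.environ; return the effective values."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        # Variables already set in the shell take precedence over the file.
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded
```

After calling `load_env_file()`, `os.environ["OPENAI_API_KEY"]` is available, which is the variable the OpenAI Python client reads by default.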

Run the Application

Add your own cookbook as a PDF to the cookbook directory, or use the cookbook included with the sample project. Interact with the application by asking food preparation and cooking questions. Start it with the following command:

```
python main.py
```
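For orientation, a single question-and-answer turn with the Assistants API might look roughly like the sketch below. It is an illustration under assumptions, not the contents of `main.py`: `ask_assistant` is a hypothetical helper, `client` is assumed to be an `openai.OpenAI()` instance, and the thread is assumed to have been created beforehand with `client.beta.threads.create()`.

```python
import time
from typing import Any


def ask_assistant(client: Any, assistant_id: str, thread_id: str, question: str,
                  poll_seconds: float = 1.0) -> str:
    """Post one user message to a thread, wait for the run, return the reply text."""
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=question
    )
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )
    # Runs execute asynchronously, so poll while the run is still being processed.
    while run.status in ("queued", "in_progress", "cancelling"):
        time.sleep(poll_seconds)
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)
    if run.status != "completed":
        raise RuntimeError(f"Run stopped with status: {run.status}")
    # messages.list returns the newest message first by default.
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    return messages.data[0].content[0].text.value
```

Reusing the same `thread_id` across turns is what lets follow-up questions such as "what kind of salad will pair nicely with it?" draw on earlier context.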

To terminate the program, simply use Ctrl + C. Below is an example of a conversation generated by running the application.

Turn ID	User Query	Assistant Response	Notes on Tested User Query
0	👩🏻‍🍳: Hi there, Thanksgiving is coming! I would like to have some ideas for cooking!	🔮: Thanksgiving is a great occasion to prepare a feast. Here are some main dish ideas inspired by the contents of the “Deliciously Healthy Dinners” cookbook that could suit a Thanksgiving dinner: [Main dishes listed]	This query is designed to evaluate the clarity of the Assistant’s defined role and the effectiveness of its execution, specifically in tasks such as generating creative ideas for holiday cooking.
1	👩🏻‍🍳: How to make Cornbread-Crusted Turkey?	🔮: Here’s a recipe I found for Cornbread-Crusted Turkey from your cookbook: [Recipe details]	This query seeks to assess the Assistant’s ability to accurately extract relevant information from the user’s cookbook and succinctly summarize it.
2	👩🏻‍🍳: Thanks, what kind of salad will pair nicely with it?	🔮: An Autumn Salad would pair nicely with Cornbread-Crusted Turkey. Here’s the recipe from your cookbook: [Recipe details]	This query is intended to assess the Assistant’s ability to provide useful responses to related follow-up questions.
3	👩🏻‍🍳: Can I see a visual of the autumn salad with the cornbread crusted turkey?	🔮: Absolutely! Prepare for a delightful visual preview of the recipe, coming up shortly. Please hold on!	This query aims to evaluate the Assistant’s ability to create visualizations based on recipe information previously requested by the user.
		See the visual representation generated by the Assistant via DALL-E 3, based on the textual prompt generated by GPT-4: Envision a warm, inviting plate centered with a golden-brown Cornbread-Crusted Turkey fillet, its crunchy exterior hinting at the tender, juicy meat within; Next to it lies a vibrant and colorful Autumn Salad, a bed of mixed lettuce greens punctuated by the bold reds of dried cranberries and the crisp green of thinly sliced Granny Smith apples; Scattered on top are earthy, chopped walnuts and the delicate shimmer of unsalted sunflower seeds, offering both texture and taste; The salad glistens lightly, dressed in a delicate sheen of low-fat raspberry vinaigrette, complementing the robust flavors of the turkey; The plate is a harmonious blend of the heartiness of Thanksgiving and the freshness of autumn, a celebration of both comfort and zest on the palate.
4	👩🏻‍🍳: Oh that is very nice. How many calories per serving, including the autumn salad and the cornbread crusted turkey?	🔮: The combined total calories per serving for the Autumn Salad and the Cornbread-Crusted Turkey is 395 calories.	This query is designed to assess the Assistant’s proficiency in extracting information, applying logical reasoning, and performing basic mathematical calculations.
5	👩🏻‍🍳: Can you give me a breakdown of how the total calories per serving is calculated?	🔮: Certainly! Here’s the breakdown of how the total calories per serving is calculated for the combination of Autumn Salad and Cornbread-Crusted Turkey: [Calorie breakdown]	More evaluations on the Assistant’s logical reasoning and mathematical skills.
6	👩🏻‍🍳: Nice. Can you help me create a grocery list for making the autumn salad and the cornbread crusted turkey?	🔮: Absolutely, here’s your grocery list for making the Autumn Salad and the Cornbread-Crusted Turkey: [Grocery list details]	This query aims to evaluate the Assistant’s ability to comprehend and summarize information based on its previous conversation history with the user.
7	👩🏻‍🍳: Thank you! Happy Thanksgiving!	🔮: You’re welcome! If you need any more help in the future, feel free to reach out. Have a wonderful Thanksgiving and enjoy your delicious meal!	Assess whether the Assistant can respond politely and meaningfully to the user’s casual conversation.
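Turn 3 in the conversation above hands a generated text prompt to DALL-E 3, and any reply can also be read aloud with Text-to-Speech. The sketch below shows one way those two calls could look with the OpenAI Python client. It is an illustration, not the sample project's code: the helper names, the `tts-1` model, the `alloy` voice, the image size, and the output file name are all assumptions, and `client` is assumed to be an `openai.OpenAI()` instance.

```python
from typing import Any


def generate_recipe_image(client: Any, prompt: str) -> str:
    """Ask DALL-E 3 for one image of the described dish and return its URL."""
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size="1024x1024",  # assumed size
        n=1,
    )
    return response.data[0].url


def speak_reply(client: Any, text: str, out_path: str = "reply.mp3") -> str:
    """Convert a reply to speech and save the audio bytes to `out_path`."""
    response = client.audio.speech.create(
        model="tts-1",  # assumed model choice
        voice="alloy",  # assumed voice
        input=text,
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response.read())
    return out_path
```

In an interactive loop, the image helper would receive the Assistant's imaginative prompt text, while the speech helper would receive the reply shown to the user.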