ChatGPT enters a new agentic era

PLUS: Google’s Gemini AI can now process video and images simultaneously

Today in AI

Jan 16, 2025

Howdy! Happy Wednesday, AI family, and welcome back.

In today’s edition:

ChatGPT enters a new agentic era
Google’s Gemini AI can now process video and images simultaneously
Plus trending AI tools, posts, and resources

Ready, set, go…

🤖 ChatGPT enters a new agentic era

(Gif Source: OpenAI)

OpenAI is rolling out Tasks, a beta feature in ChatGPT that lets users schedule future actions and reminders, available exclusively to Plus, Team, and Pro subscribers.

Here's what you need to know:

To set up a task, switch to “4o with scheduled tasks” in the model picker and type your reminder prompt. ChatGPT can also suggest tasks based on previous chats.
You can manage tasks directly in your chat threads or, in the web version, through the “Tasks” section under the profile menu, where you can modify or cancel them.
Notifications for completed tasks will be sent to users across web, desktop, and mobile platforms.
You can have up to 10 active tasks running at the same time. If you reach that limit, you must pause or delete tasks to create new ones.
The feature is still in beta, so there are a few limits, like no voice task setup and no support for continuous background searches or transactions.

Why it matters:

Tasks move ChatGPT closer to being a versatile AI companion, potentially reducing the need for separate assistants like Siri and Alexa to manage reminders.

This update also aligns with OpenAI's reported plans to launch "Operator," an autonomous computer-controlling agent, in January 2025, signaling a broader push toward AI-driven productivity.

SIDE UPDATES

Luma Labs’ Ray2 leaked

Luma Labs’ Ray2, an AI video model that leaked on Monday, promises more realistic physics and smoother motion than OpenAI's Sora. It can create 10-second videos from text or images, featuring advanced cinematography and natural interactions between objects, including humans, animals, and vehicles. Early testers praise how lifelike its videos look, with one filmmaker declaring “Holodeck is coming” after seeing a video of a tightrope walker. Ray2 is expected to launch soon on Dream Machine and AWS Bedrock but isn’t available yet.

Google’s Gemini AI can now process video and images simultaneously

Google's Gemini AI can now process live video and static images at the same time, a capability demonstrated in the experimental AnyChat application and one that assistants like ChatGPT have not offered. The functionality is supported through Gemini’s API but is not yet available in Google’s official apps. Developers can build custom visual applications on top of it with only a small amount of code. Potential uses include helping students with questions that combine a textbook page and their homework, or giving artists real-time feedback on their work.

Syntilay unveils first AI-designed 3D-printed shoes

Syntilay, a startup led by 25-year-old Ben Weiss, has introduced the first AI-designed 3D-printed shoes. Created using tools like Midjourney and Vizcom AI, these innovative shoes are available in five colors and priced at $149.99. With only a few thousand pairs available, they represent a blend of cutting-edge design and limited-edition exclusivity, highlighting AI's growing role in fashion and product development.

MiniMax launches open-source models with 4M token context window

MiniMax has launched the MiniMax-01 series, featuring MiniMax-Text-01, a language model, and MiniMax-VL-01, a visual multimodal model. These models offer a 4-million-token context window, roughly 20 to 32 times larger than those of leading models. Built on the company's Lightning Attention architecture, they are designed for long-context tasks and are positioned as rivals to top-tier models like GPT-4 and Claude-3.5.

TRENDING TOOLS

21st.dev > GitHub + Pinterest to make your AI websites look beautiful. (link)
Fullmoon > Chat with private and local large language models. (link)
Shapen > Create 3D models from images and text descriptions. (link)
Ontosight > Make research faster and smarter with AI-driven discovery, understanding, and analysis. (link)

THINK PIECES / RESOURCES

Build everything with AI agents: Here's how by David Ondrej. (link)
Robocop? DARPA aims to develop AI tools for spotting, and even anticipating, money laundering activities before they occur. (link)
Why “human-AI symbiosis” is essential for business and society. (link)
57 episodes in the astonishing 70-year history of AI. (link)
AI mistakes are very different than human mistakes. (link)
Artificial intelligence: what five giants of the past can teach us about handling the risks. (link)

That’s all for today’s issue, folks.
