Developer and entrepreneur David Ondrej demonstrates Grok 4, xAI’s latest AI model. According to the benchmarks he presents, it outperforms models from competitors like OpenAI, Anthropic, Meta, and Google on GPQA (graduate-level, Google-proof questions), advanced math, and coding.
He highlights three variants: Grok 4 Base (no tools), Grok 4 Base with tools, and Grok 4 Heavy ($300/month), a multi-agent system where four parallel agents collaborate to solve problems, achieving PhD-level performance across fields.
Key features include native multimodality (text and image processing), a 256k context window, function calling, and structured outputs.
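To make "function calling" and "structured outputs" concrete, here is a minimal sketch of the general pattern: the application describes a tool in JSON-schema style, the model replies with a tool name plus JSON arguments, and the application runs the matching local function. Everything named here (`get_tasks`, `TOOL_REGISTRY`, the task data, the `KanbanCard.tsx` file name) is a hypothetical illustration and is not taken from xAI's documentation or from the video.

```python
import json

# Hypothetical tool description in the JSON-schema style used by many chat APIs.
# The tool name and fields are assumptions for illustration only.
GET_TASKS_TOOL = {
    "name": "get_tasks",
    "description": "Return task titles for a project, optionally filtered by priority.",
    "parameters": {
        "type": "object",
        "properties": {
            "project": {"type": "string"},
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        },
        "required": ["project"],
    },
}

# Stand-in data so the sketch runs without any external service.
TASKS = [
    {"title": "Fix login bug", "project": "vectal", "priority": "high"},
    {"title": "Write changelog", "project": "vectal", "priority": "low"},
    {"title": "Plan launch", "project": "marketing", "priority": "high"},
]


def get_tasks(project, priority=None):
    """Local function that the model is allowed to call."""
    return [
        task["title"]
        for task in TASKS
        if task["project"] == project
        and (priority is None or task["priority"] == priority)
    ]


TOOL_REGISTRY = {"get_tasks": get_tasks}


def dispatch_tool_call(tool_call):
    """Run the local function named in a model-issued tool call."""
    func = TOOL_REGISTRY[tool_call["name"]]
    args = json.loads(tool_call["arguments"])  # arguments arrive as a JSON string
    return func(**args)


def parse_structured_output(raw, required_keys):
    """Parse a JSON reply and check that the expected keys are present."""
    data = json.loads(raw)
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

For example, a simulated tool call of `{"name": "get_tasks", "arguments": '{"project": "vectal", "priority": "high"}'}` passed to `dispatch_tool_call` returns `["Fix login bug"]`. In a real integration the tool call would come from the model's response rather than being written by hand.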
David speculates that an earlier Grok 3 version’s unrestricted behavior (Mechahitler) was a marketing ploy to generate hype.
He urges viewers to shift from consuming AI content to building startups or agents. He notes that xAI is running out of benchmarks to test against because of how quickly the models are advancing. Ondrej predicts xAI and Google DeepMind will lead the AGI race, based on talent, data, compute, and execution. He asserts Grok 4 can run simple businesses (e.g., vending machines) and shares early API access results showing top performance on relevant benchmarks. Upcoming releases include a specialized coding model (next month), a multimodal video agent (September), and a video generation model (October).
The bulk of the video is a live demo using Grok 4 (via Cursor IDE and a repo prompt tool) to make UI tweaks to Vectal, Ondrej’s AI-powered task management startup (used by 55,000+ people). He integrates Grok 4 into Vectal for $20/month (Pro plan), avoiding the $300 Heavy cost. Steps include:
Prompting Grok 4 to analyze the codebase and identify relevant files for Kanban board changes.
Using Grok 4 Heavy (via grok.com) for complex tasks like repositioning elements (e.g., priority badges, task names, due dates) to create a minimal, clean design.
Testing simpler tweaks in Cursor’s agent mode, emphasizing precise prompting and “do not change anything else” to avoid over-edits.
Handling Git operations (branching, committing, pushing, creating PRs) to demonstrate tool calling and reasoning.
Iterating on issues like padding removal and conditional rendering (e.g., hiding project names in specific views).
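The Git step in the list above (branching, committing, pushing, creating a PR) can be sketched as a plain command sequence. This is only an illustration of the workflow an agent would need to carry out. The branch name, commit message, the `origin` remote, and the use of the GitHub CLI (`gh`) for the pull request are assumptions, since the summary does not say which commands were run in the video. The sketch builds the commands without executing them, so it is safe to run anywhere.

```python
import shlex


def build_git_workflow(branch, commit_message, pr_title, pr_body=""):
    """Return the commands for branch -> commit -> push -> pull request.

    Assumes a remote named "origin" and the GitHub CLI for the PR step.
    """
    return [
        ["git", "checkout", "-b", branch],        # create and switch to a new branch
        ["git", "add", "-A"],                     # stage all changes
        ["git", "commit", "-m", commit_message],  # commit the staged changes
        ["git", "push", "-u", "origin", branch],  # push and set the upstream
        ["gh", "pr", "create", "--title", pr_title, "--body", pr_body],
    ]


def render(commands):
    """Render each command as a shell-safe string for review before running."""
    return [shlex.join(command) for command in commands]


if __name__ == "__main__":
    workflow = build_git_workflow(
        branch="ui/kanban-tweaks",  # hypothetical branch name
        commit_message="Tidy Kanban cards",
        pr_title="Kanban card cleanup",
        pr_body="Reposition priority badge and due date.",
    )
    for line in render(workflow):
        print(line)
```

Printing the commands first, rather than running them directly, mirrors the cautious approach in the demo: the agent's proposed actions can be reviewed before anything touches the repository. To execute them, each list could be passed to `subprocess.run(command, check=True)`.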
He highlights Grok 4’s built-in tool calling (trained via reinforcement learning) and multi-agent collaboration, contrasting it with the independent variations produced by tools like Codex. There were minor integration glitches (e.g., in Cursor, possibly due to the recent release), but he expects fixes soon.
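The contrast between independent variations and collaborating agents can be illustrated with a toy sketch. The source does not describe how Grok 4 Heavy's agents actually share or merge their work, so the reconciliation step below (a simple majority vote) is an assumption chosen for brevity, and the "agents" are stub functions rather than model calls.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_agents_in_parallel(agents, problem):
    """Run every agent on the same problem at the same time, preserving order."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        return list(pool.map(lambda agent: agent(problem), agents))


def independent_variations(agents, problem):
    """Independent style: return every answer and leave the choice to the user."""
    return run_agents_in_parallel(agents, problem)


def reconciled_answer(agents, problem):
    """Collaborative style (assumed): compare answers and keep the most common one."""
    answers = run_agents_in_parallel(agents, problem)
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes


# Stub agents standing in for model calls; one is deliberately wrong.
def careful_agent(numbers):
    return sum(numbers)


def sloppy_agent(numbers):
    return sum(numbers) + 1


if __name__ == "__main__":
    agents = [careful_agent, careful_agent, sloppy_agent, careful_agent]
    print(independent_variations(agents, [1, 2, 3]))  # [6, 6, 7, 6]
    print(reconciled_answer(agents, [1, 2, 3]))       # (6, 3)
```

In the independent style the user receives four candidate answers and must pick one. In the reconciled style the disagreeing answer is outvoted and a single result is returned. A real multi-agent system would likely use something richer than voting, such as agents critiquing each other's reasoning.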
While Grok 4 Heavy excels at advanced problem-solving, he recommends the base version for most users and plans to use it as his default in Cursor and Vectal, alongside Claude for agentic tasks. He promotes his own Vectal tool as a superior alternative to tools like Todoist, ClickUp, or Trello, offering AI agents, custom prompts per project, team plans, and access to top models (e.g., Grok 4, Gemini 2.5 Pro, Claude 3 Opus). He offers personal onboarding for teams and stresses switching for productivity gains. He also plugs his “New Society” community for AI tutorials, startup building (e.g., growing Vectal to $10K+ MRR), and cutting-edge updates.
Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technologies and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.

