June 17 - AI with Whiskey

#GenAI world is on 🔥 again, with massive announcements in the past couple of weeks… Quick recap as of 17th June.

But first.. a 🥃 whiskey pairing suggestion: Godawan 100. Since I am posting from India this week, I wanted to recommend Godawan 100, which won Single Malt of the Year at the 2024 London Spirits Competition. It has a slightly sweet tropical flavor, with hints of caramel, charcoal and anise and a long dry finish. The hot desert climate in Rajasthan, India leads to a higher % of whiskey evaporating as it ages (called the angel’s share), giving it a distinct flavor vs. peers from colder Scotland. In recent years Indian whiskey has gained a lot of popularity with Indri, Rampur, Amrut etc. Would love to hear your favorites.

Apple

  • Apple went from never mentioning the word ‘AI’ on stage to fully embracing it and branding it as ‘Apple Intelligence’ at WWDC.

  • Most features were derivatives of Google etc., but done the Apple way, with a focus on frictionless user experience and a high bar for privacy. Apple runs most AI on-device, falls back to Private Cloud Compute when needed, and in very few cases (with explicit user permission) invokes OpenAI’s ChatGPT.

  • Bloomberg reported that no money was exchanged in the Apple-OpenAI deal.

  • As part of the developer conference, we got hands-on with Apple’s small language models (SLMs) that run on-device, and I am truly impressed by their implementation and the developer tools on top.

  • The SLMs are specialized for each task using popular LoRA adapters, along with techniques like speculative decoding, context pruning and grouped-query attention, ironically open-sourced by Google, Microsoft etc.

  • More details, Aha/Uh-Ohs in my post and podcast discussion.
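
For readers curious what "specialized for each task using LoRA adapters" means in practice, here is a minimal sketch in plain Python. It is not Apple's implementation: the weights, the task names (`summarize`, `rewrite`) and the helper functions are all hypothetical. The idea it illustrates is that one frozen base weight `W` is shared, and each task only adds a small low-rank pair `(A, B)`, so the output is `W x + (alpha / r) * B (A x)`.

```python
# Minimal LoRA-style adapter sketch in pure Python (illustrative only).
# Assumption: one frozen base weight W (d_out x d_in) shared by all tasks,
# plus a small per-task pair (A, B) of rank r. All values are made up.

def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    """Compute y = W x + (alpha / r) * B (A x) without modifying W."""
    r = len(A)                          # adapter rank = number of rows in A
    base = matvec(W, x)                 # frozen base-model path
    low_rank = matvec(B, matvec(A, x))  # task-specific low-rank path
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]

# Frozen 2x3 base weight, shared across tasks.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]

# Two hypothetical rank-1 task adapters: A is r x d_in, B is d_out x r.
adapters = {
    "summarize": ([[1.0, 1.0, 0.0]], [[0.5], [0.0]]),
    "rewrite":   ([[0.0, 0.0, 1.0]], [[0.0], [2.0]]),
}

x = [1.0, 2.0, 3.0]
outputs = {task: lora_forward(W, A, B, x) for task, (A, B) in adapters.items()}
```

Swapping tasks only swaps the tiny `(A, B)` pair, which is why this pattern suits on-device models: the large base weights stay loaded once.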

Nvidia

  • ‘Jensanity’ continues with Nvidia’s meteoric rise past a $3T valuation. Nvidia went from $1T to $2T in 269 days, and then to $3T in 103 days! A Jensen appearance and an Nvidia partnership have become must-haves for any respectable conference keynote.

  • At Computex, Jensen announced their roadmap of Blackwell Ultra (2025) and Rubin (2026) platforms, flaunted the future of AI factories, Omniverse and digital workers, and paid his humble respects to Taiwan. This 15 min recap is time well spent.

  • Nvidia released Nemotron-4 340B, a family of open-source, market-leading LLMs optimized for NVIDIA. It provides great tools for customers to generate high-quality synthetic training data, supports 50+ languages and 40+ coding languages, and runs on 16x H100s, or in int4 on 8x H100s.

  • Despite AMD’s big announcements of Zen 5, XDNA 2, 3rd gen NPU and GPUs, and Intel’s Xeon 6, Gaudi 3, Lunar Lake etc., both stocks are slightly down over the last month, vs. Nvidia up 40%.

Snowflake

  • Snowflake Summit was another big showcase of AI-everywhere. A ton of new capabilities were added to the Cortex AI suite, but most are still in preview. Like most data-first vendors, Snowflake offers one-click chat-with-data (grounded RAG patterns) via Cortex Search, easy indexing of documents etc.

  • In partnership with Nvidia, Cortex AI will also offer NVIDIA NeMo Retriever and Triton Inference Server, and NIM will include the Arctic LLM.

  • Model registry includes most leading open-source models, but lacks access to frontier models from OpenAI, Google, Anthropic. I am particularly waiting for Document AI using Arctic TILT to mature.

  • Iceberg Tables GAed, delivering full storage interoperability and making it easier to use, govern and collaborate on Iceberg data stored externally.

  • Polaris Catalog (being open-sourced in 90 days) provides an easy way to leverage any query engine (eg. Apache Flink, Spark, Dremio, Python etc.) across data stored in open format on AWS, Azure, Google etc.

  • Bringing a random audience member on stage to do a live demo was courageous.
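
To make the "grounded RAG pattern" behind chat-with-data a bit more concrete, here is a toy sketch. It does not use the Cortex Search API: the scoring is a crude token overlap standing in for a real search index, and the function names and sample documents are made up for illustration. It shows the two steps that matter: retrieve the most relevant documents, then build a prompt that tells the model to answer only from those sources.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, doc):
    """Count overlapping tokens (a crude stand-in for vector or hybrid search)."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(doc))
    return sum(min(q[t], d[t]) for t in q)

def retrieve(query, docs, k=2):
    """Return the k documents with the highest overlap score."""
    ranked = sorted(docs, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]

def build_grounded_prompt(query, docs, k=2):
    """Build a prompt that restricts the model to the retrieved sources."""
    context = retrieve(query, docs, k)
    numbered = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(context))
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"Sources:\n{numbered}\n"
        f"Question: {query}"
    )

# Hypothetical mini knowledge base.
docs = [
    "Iceberg Tables are generally available and store data in an open format.",
    "Polaris Catalog lets multiple query engines read the same Iceberg data.",
    "The cafeteria serves lunch from noon to two.",
]
question = "Which query engines can read Iceberg data?"
prompt = build_grounded_prompt(question, docs)
```

The resulting `prompt` string is what would be sent to whichever LLM you use; the irrelevant cafeteria document never reaches the model.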

Databricks

  • The data world is split over major open table formats: Delta Lake (eg. Databricks), Iceberg (eg. Snowflake) etc. With the acquisition of Tabular (original creators of Iceberg), Databricks now has the technical expertise to bring the formats closer and drive interoperability. Open-sourcing their Unity Catalog is a big deal. It further accelerates that vision via UniForm and takes on Snowflake’s open-source Polaris Catalog.

  • Databricks is continuing to invest heavily in MosaicML (that they acquired last yr for $1.3B) and released previews of a full suite of GenAI products eg. Agent Framework, Evaluation, Model Training, Gateway, Tools Catalog.

  • Photon is Databricks’ engine to run extremely fast queries (eg. ETL, streaming, analytics, interactive queries) directly on the client’s data lake. They announced a partnership with Nvidia to use CUDA GPU acceleration to supercharge Photon, and their open DBRX LLMs will be available via Nvidia NIM microservices for any-cloud and on-prem.

  • Databricks is leveraging data governance from Unity Catalog to power AI & BI Dashboards and conversational Genie. Plus, it provides ‘Certified Answers’ with data lineage in LLM responses.

Encourage you to reach out to Venkat Krishnan for a deeper dive on data.

SAP

  • SAP Sapphire was a very consequential annual event for SAP. In 2023, ~44% of SAP revenue came from cloud with 72% margin, and in their 1Q24 earnings, they aspired to make that ~60% by 2025. SAP is going all in on AI, but exclusively on cloud to entice more customers to upgrade. Plus, the $1.5B WalkMe acquisition should help with adoption.

  • To maintain their stickiness, SAP needs to make it seamless for clients to bring their favorite AI to the data hosted in SAP vs. take the data out. Same philosophy as Snowflake, Databricks, Salesforce, ServiceNow etc.

  • Joule, SAP’s copilot, is being embedded ubiquitously to enable seamless interactions across events, audience segmentation, HR, planning etc. Btw, under the hood, it’s IBM’s watsonx Assistant + a large LLM ecosystem.

  • SAP CEO Christian Klein set an audacious goal that by the end of the year, Joule will manage 80% of the most common tasks performed by SAP’s 300M+ users.

  • SAP’s GenAI Hub brings models from IBM Granite, OpenAI, Meta, Mistral, Google, AWS, Nvidia etc. For code, SAP fine-tuned open-source StarCoder on 250M lines of ABAP code and 2TB of SAP community data. Plus, they demoed a deeper bi-directional partnership w/ Microsoft Copilot.

  • I would have loved to see more tooling around Responsible AI.

AI video generation

  • So far OpenAI’s Sora has been the king of text to hyper-realistic video, but it is not publicly available until they figure out how to responsibly release it.

  • We saw two new noteworthy competitors:

    • Luma AI’s Dream Machine for quick 5 sec videos from text. In my experience it does much better at image-to-video than text-to-video. Not Sora level yet, but completely free and public access is available right now.

    • Chinese AI firm Kuaishou’s Kling does an excellent job with complex videos eg. people eating noodles, storytelling, longer 2 min 1080p videos. Stunning.

      Kling’s coolest and scariest feature was its ability to take an image of a person and make it dance, or follow any stick-figure action. While these will be a massive hit on TikTok, I am very worried about bad actors, and about what AI governance is needed to ensure it’s not abused. I didn’t see any guardrails whatsoever!

New GenAI models

  • Alibaba’s Qwen2 crushed leading open-source LLMs, beating Mistral AI’s Mixtral 8×22B and Meta’s Llama 3 on most benchmarks. 5 models from 0.5B to 72B, including an MoE, with Apache 2.0 license and the 72B slightly restricted.

  • Mistral released Codestral, for code generation across 80+ languages with impressive benchmarks, but its non-production license terms make it unusable for commercial work.

  • Stability.ai open-sourced their Stable Diffusion 3 Medium. The 2B model runs well locally on my MacBook. Their 8B Large model, available via API, does an incredible job at text-to-image for objects and text, but the internet is full of SD3 memes of human figures.

  • Stability.ai released an open-source 1.2B model, stable-audio-open, that can generate 47 sec text-to-audio clips. I was able to get some legit drum beats, riffs and ambient sounds. Encouraging to see that it’s trained on ~500k licensed recordings from Freesound and Free Music Archive.

Misc important AI news

  • Oracle is on a great trajectory, with cloud revenue exceeding 50% of total revenue for the first time, embracing a broad GenAI stack and mega partnerships with OpenAI and Google. But the true hero was Oracle’s engineer Saurabh, who stole the hearts of US cricket team fans.

  • Mistral AI launched new tools for fine-tuning, either on the client’s own infrastructure or as a serverless SaaS on Mistral’s La Plateforme.

  • Amazon science team released Project PI (Private Investigator) that utilizes AI, computer vision, and generative AI to detect product defects before items are shipped to customers.

  • Meta released a new Comprehensive RAG Benchmark (CRAG) with 4,409 Q&A pairs and mock APIs to simulate enterprise environments. It’s the first time we are able to evaluate our own LLM setup against Copilot, Gemini, ChatGPT and Perplexity across multiple domains.

  • TogetherAI’s Mixture of Agents (MoA) uses multiple LLMs in a layered architecture to iteratively improve results and outperform OpenAI GPT-4o.

  • FineWeb, an excellent large-scale dataset of 15 trillion tokens (44TB), and the high-quality FineWeb-Edu were released to help LLM pre-training.
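
The layered Mixture of Agents idea mentioned above can be sketched in a few lines. This is my own simplified reading of the pattern, not TogetherAI's code: the `Model` interface, the prompt wording and the stub models are all assumptions made for illustration. Each layer's models see the original prompt plus the previous layer's answers, and a final aggregator produces the response.

```python
from typing import Callable, List

# Hypothetical interface: a model is any function from prompt text to answer text.
Model = Callable[[str], str]

def with_references(prompt: str, references: List[str]) -> str:
    """Append the previous layer's answers to the prompt, if there are any."""
    if not references:
        return prompt
    refs = "\n".join(f"- {r}" for r in references)
    return f"{prompt}\nPrevious responses to improve on:\n{refs}"

def mixture_of_agents(prompt: str, layers: List[List[Model]], aggregator: Model) -> str:
    """Run each layer on the prompt plus the prior layer's answers, then aggregate."""
    references: List[str] = []
    for layer in layers:
        layer_prompt = with_references(prompt, references)
        references = [model(layer_prompt) for model in layer]
    return aggregator(with_references(prompt, references))

def make_stub(name: str) -> Model:
    """Stand-in for a real LLM call; reports how many references it was shown."""
    def stub(prompt: str) -> str:
        seen = prompt.count("\n- ")
        return f"{name} answer (saw {seen} refs)"
    return stub

layers = [
    [make_stub("A"), make_stub("B")],
    [make_stub("C"), make_stub("D"), make_stub("E")],
]
final = mixture_of_agents("What is RAG?", layers, make_stub("Aggregator"))
```

In a real setup each stub would be a call to a different LLM, and the cost grows with the number of models per layer times the number of layers, which is the trade-off for the quality gain.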

--------
🔔 If you like such content, I encourage you to connect on my LinkedIn ♻️ Recommend your friends to subscribe to this free ‘AI with Whiskey’ newsletter
--------