Intelligent Inference

VOLUME 1 - ISSUE 16 ~ February 20, 2025

In this edition of the “CIO Two Cents” newsletter, I explore how smaller language models, efficient tools, and inference chips are revolutionizing AI with cost savings, enhanced performance, and future-ready innovations.
— Yvette Kanouff, partner at JC2 Ventures

The JC2 Ventures team:
(John J. Chambers, Shannon Pina, John T. Chambers, me, and Pankaj Patel)

Here are the KEY TAKEAWAYS:

Ready for the highlights? Dive into the key takeaways below!

(1) Smaller language models (SLMs) can be just as powerful as larger ones, offering significant cost savings and enhanced efficiency.

(2) Inference chips are revolutionizing AI, with the market projected to hit USD $90.6B by 2030.

(3) AI-optimized clouds and open standards are crucial for leveraging future innovations while maintaining accuracy, security, and compliance.

MY “2 CENTS” FOR THE MOMENT

Artificial Intelligence costs are dominating many of my conversations these days. As companies adopt AI strategies, the training and run-time costs can be brutal. I’m curious what strategies and products will evolve to help with these costs, so I thought I’d mention some of my current favorites.

SLMs – Bigger isn’t always better. This holds true for AI language models. More parameters in an AI model require more compute, longer processing, and more cost. However, this doesn’t always translate to improved accuracy. The key is to match the model to the task and train it on the right data. Not every AI use case needs a large language model, especially when SLMs or custom language models trained on very specific data can save compute. Smaller models are showing great applicability and accuracy.
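To put rough numbers on the link between parameters and cost, here is a back-of-the-envelope sketch. It only estimates the memory needed to hold a model's weights (parameter count times bytes per parameter). The two model sizes and the function name `weight_memory_gb` are assumptions chosen for illustration, not figures from any specific product.

```python
def weight_memory_gb(num_params: float, bits_per_param: int = 16) -> float:
    """Approximate memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Hypothetical model sizes, chosen only for illustration.
for name, params in [("small (3B)", 3e9), ("large (70B)", 70e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit")
# small (3B): ~6 GB of weights at 16-bit
# large (70B): ~140 GB of weights at 16-bit
```

This ignores activation memory, caching, and batching, so real serving footprints are larger, but the ratio between the two models is the point.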

Model Optimization Tools – I am excited about some of the model optimizations that provide better efficiency and outcomes. Pruning removes weights or connections that contribute little to the output, quantization stores weights at lower numerical precision, and knowledge distillation trains a smaller, more efficient model to reproduce the behavior of a larger one. All are effective strategies for making inference more cost-effective.
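For readers who want to see the mechanics, the following is a minimal sketch of two of these ideas, magnitude pruning and symmetric 8-bit quantization, applied to a plain list of numbers. The weight values, the threshold, and the function names are made up for illustration; production toolkits work on full tensors and are considerably more sophisticated. Knowledge distillation is a training procedure and is not shown here.

```python
def prune_by_magnitude(weights, threshold):
    """Zero out weights whose absolute value is below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]


def quantize_int8(weights):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q_weights, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in q_weights]


# Made-up weights for illustration only.
weights = [0.82, -0.03, 0.47, 0.01, -0.65]
pruned = prune_by_magnitude(weights, threshold=0.05)
q_weights, scale = quantize_int8(pruned)
restored = dequantize(q_weights, scale)

print("pruned:   ", pruned)
print("quantized:", q_weights)
print("restored: ", [round(r, 3) for r in restored])
```

The restored values differ slightly from the originals. That small loss of precision is the trade made in exchange for storing each weight in one byte instead of two or four.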

Efficient Models, Model Merging – Creating efficiency by combining various models into one optimized model is a great trend. This gives developers the option to take the best of many models and save on training time and resources.
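One simple way merging can work, among several, is to average the parameters of models that share the same architecture. The sketch below assumes each model is represented as a dictionary of parameter lists. The names `merge_models`, `model_a`, and `model_b`, and all of the values, are hypothetical and stand in for real checkpoints.

```python
def merge_models(state_dicts, mix=None):
    """Weighted average of parameters across models with identical layouts."""
    if mix is None:
        # Default to an equal share for every model.
        mix = [1.0 / len(state_dicts)] * len(state_dicts)
    if len(mix) != len(state_dicts):
        raise ValueError("need exactly one mixing weight per model")
    names = state_dicts[0].keys()
    if any(sd.keys() != names for sd in state_dicts):
        raise ValueError("all models must have the same parameter names")

    merged = {}
    for name in names:
        # Group the i-th value of this parameter from every model together.
        columns = zip(*(sd[name] for sd in state_dicts))
        merged[name] = [sum(m * v for m, v in zip(mix, col)) for col in columns]
    return merged


# Hypothetical two-layer models with made-up values.
model_a = {"layer1": [1.0, 2.0], "layer2": [0.0]}
model_b = {"layer1": [3.0, 4.0], "layer2": [1.0]}

merged = merge_models([model_a, model_b])
print(merged)  # {'layer1': [2.0, 3.0], 'layer2': [0.5]}
```

Because merging reuses weights that have already been trained, it involves no additional training run, which is where the time and resource savings come from.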

Inference Chips – I have mentioned in the past that I am a fan of inference chips, which are separate from math-optimized training chips and provide huge energy savings and speed optimizations. AWS CEO Matt Garman estimates that inference constitutes about half of the work done on AI computing servers today. Nvidia has no lack of competition in the inference market: Groq, Cerebras, Positron AI, and Amazon are all touting groundbreaking results with their inference chips. The AI inference chip market size is projected to reach USD $90.6B by 2030.

AI-Optimized Clouds – There are many cloud solutions that optimize hardware and caching mechanisms for your models. Parallel-processing GPUs and optimized inference servers are available.

Tool and Cloud Neutrality – We saw many companies tie themselves to one cloud provider through its tooling, and the same risk applies to AI. By sticking with open standards and vendor-neutral development, companies are more likely to be able to leverage future efficiency innovations across various vendors.

I find it fascinating how conversations have shifted from LLMs to tooling and optimization. Getting optimal results with AI is a critical topic. With so many model choices, how can we use AI to get cost-effective results with maximum accuracy, security, and compliance? That said, let’s not lose focus on what’s most important – the outcomes that help our businesses, services, or products.

What are some of your favorite tools and optimization mechanisms?


John Chambers, February 24, 2025, JC2 Ventures

