
Jake Reynolds

Co-Founder / CTO

Why Not AI?

When Tim and I were first talking about Wirespeed, we originally thought about buying wirespeed.ai. The domain was available, it looked cool, and it matched all the hype going around these days. But it raised the question: what about our company would actually be AI? A couple of ChatGPT calls with few-shot prompts and a lot of crossed fingers? Our problems came down to a few common areas, highlighted below.

#But First

A lot of people have thought I’m anti-AI because of how I’ve talked about it. AI is really amazing, and I’ve spent a considerable amount of time learning neural networks and LLMs and trying to apply them to cybersecurity. But there’s a difference between an AI company and a company using AI. The simple filter: did your company exist and focus on AI before November 30, 2022? No? Then you’re chasing hype. If you did focus on it before then, I applaud you and wish you massive success. I’ll give an honorable mention to the legitimate AI startups that have joined the fight since ’22 as well.

I have friends that built neural networks for large credit card providers to detect fraud way before it was cool. Those companies rock and you won’t find “AI” anywhere on their landing page.

Bad Things Come in Threes

#Repeatability

The single most important problem I have with using AI as a foundational component of any computing system is repeatability. If you’re like the majority of companies riding the hype, you’re going to use a hosted model from Anthropic or OpenAI. The first thing I learned back in my college Java 101 class was unit testing: how to write code and make sure it works the way you expect. The problem with AI is that it’s non-deterministic and has intentional randomness, which means it’s hard to get the exact same results every time.

So let’s write some unit tests for a function that calls OpenAI, shall we? We’ll be using BunJS, because after all, I do still chase some hype.

#Ground Rules

I can already hear people saying, “But you can control the temperature and seed of the model!” So we’ll set those to their most deterministic values and see what happens.

#Setup

$> bun i openai

#Code

import { it, expect } from "bun:test";
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: "sk-proj-xxx",
});

// Number of iterations to test our prompt against
const ITERATIONS = 10;

// Text the prompt will attempt to summarize
const TEXT_TO_SUMMARIZE = `Because adversaries are speeding up, you need an approach to triaging alerts that runs at Wirespeed. Our algorithm performs all the steps that a team of senior analysts would take, but our execution time is in seconds, often less than one. When seconds count, alerts should not be waiting in a queue for a resolution.`;

// Call OpenAI and return first completion
async function callOpenAI(): Promise<string | null> {
  const completion = await openai.chat.completions.create({
    messages: [
      {
        role: "system",
        content:
          "You are a summarization assistant, you will be provided with text and provide a 1 sentence summary of it. Your goal is to be consistent and create summaries that are repeatable.",
      },
      { role: "user", content: TEXT_TO_SUMMARIZE },
    ],
    model: "gpt-4o-mini",
    temperature: 0,
    seed: 42,
  });

  return completion.choices[0].message.content;
}

// Get 1 base value to compare against
const expected = await callOpenAI();

// Let 'er rip
for (let i = 0; i < ITERATIONS; i++) {
  it("calls openai consistently", async () => {
    const result = await callOpenAI();
    expect(result).toEqual(expected);
  });
}

#Results

$> bun test
bun test v1.1.18 (5a0b9352)

index.spec.ts:
✓ calls openai consistently [1330.09ms]
36 |
37 | // Let 'er rip
38 | for (let i = 0; i < ITERATIONS; i++) {
39 | it("calls openai consistently", async () => {
40 | const result = await callOpenAI();
41 | expect(result).toEqual(expected);
^
error: expect(received).toEqual(expected)

Expected: "The algorithm enables rapid triaging of alerts at Wirespeed, executing the necessary steps in seconds to ensure timely resolution without delays."
Received: "The algorithm enables rapid triaging of alerts at Wirespeed, executing tasks typically handled by senior analysts in under a second to ensure timely resolution."

      at /Users/jreynolds/Downloads/blog/index.spec.ts:41:20

✗ calls openai consistently [1299.73ms]
✓ calls openai consistently [988.63ms]
36 |
37 | // Let 'er rip
38 | for (let i = 0; i < ITERATIONS; i++) {
39 | it("calls openai consistently", async () => {
40 | const result = await callOpenAI();
41 | expect(result).toEqual(expected);
^
error: expect(received).toEqual(expected)

Expected: "The algorithm enables rapid triaging of alerts at Wirespeed, executing the necessary steps in seconds to ensure timely resolution without delays."
Received: "The algorithm enables rapid triaging of alerts at Wirespeed, executing in seconds to ensure timely resolution without delays."

      at /Users/jreynolds/Downloads/blog/index.spec.ts:41:20

✗ calls openai consistently [1089.58ms]
36 |
37 | // Let 'er rip
38 | for (let i = 0; i < ITERATIONS; i++) {
39 | it("calls openai consistently", async () => {
40 | const result = await callOpenAI();
41 | expect(result).toEqual(expected);
^
error: expect(received).toEqual(expected)

Expected: "The algorithm enables rapid triaging of alerts at Wirespeed, executing the necessary steps in seconds to ensure timely resolution without delays."
Received: "The algorithm enables rapid triaging of alerts at Wirespeed, executing in seconds to ensure timely resolution without delays."

      at /Users/jreynolds/Downloads/blog/index.spec.ts:41:20

✗ calls openai consistently [1023.25ms]
✓ calls openai consistently [1332.99ms]
36 |
37 | // Let 'er rip
38 | for (let i = 0; i < ITERATIONS; i++) {
39 | it("calls openai consistently", async () => {
40 | const result = await callOpenAI();
41 | expect(result).toEqual(expected);
^
error: expect(received).toEqual(expected)

Expected: "The algorithm enables rapid triaging of alerts at Wirespeed, executing the necessary steps in seconds to ensure timely resolution without delays."
Received: "The algorithm enables rapid triaging of alerts at Wirespeed, executing in seconds to ensure timely resolution without delays."

      at /Users/jreynolds/Downloads/blog/index.spec.ts:41:20

✗ calls openai consistently [1310.96ms]
✓ calls openai consistently [1146.04ms]
✓ calls openai consistently [1119.85ms]
36 |
37 | // Let 'er rip
38 | for (let i = 0; i < ITERATIONS; i++) {
39 | it("calls openai consistently", async () => {
40 | const result = await callOpenAI();
41 | expect(result).toEqual(expected);
^
error: expect(received).toEqual(expected)

Expected: "The algorithm enables rapid triaging of alerts at Wirespeed, executing the necessary steps in seconds to ensure timely resolution without delays."
Received: "The algorithm enables rapid triaging of alerts at Wirespeed, executing tasks typically handled by senior analysts in under a second to ensure timely resolutions."

      at /Users/jreynolds/Downloads/blog/index.spec.ts:41:20

✗ calls openai consistently [1359.62ms]

5 pass
5 fail
10 expect() calls
Ran 10 tests across 1 files. [13.25s]

I’ll cut to the good part for you:

5 pass
5 fail
10 expect() calls
Ran 10 tests across 1 files. [13.25s]

If your developers were right 50% of the time, how long would they last? Notice this is with ZERO temperature and a static seed value. So what gives? There’s a little-known feature of most hosted models called the system fingerprint. Just like normal SaaS products, LLM providers are constantly trying small tweaks and seeing how they impact performance. Even though the model is gpt-4o-mini, you can be routed to different iterations of it across requests.
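You can watch this routing happen yourself: the chat completion response exposes a system_fingerprint field identifying the backend configuration that served your request. Here’s a minimal sketch reusing the client setup from above (the prompt is just an illustration):

import OpenAI from "openai";

const openai = new OpenAI({ apiKey: "sk-proj-xxx" });

// Log the backend fingerprint next to each response. If the fingerprint
// changes between runs, a different iteration of the model served you,
// and identical output was never on the table.
for (let i = 0; i < 5; i++) {
  const completion = await openai.chat.completions.create({
    messages: [{ role: "user", content: "Say hello." }],
    model: "gpt-4o-mini",
    temperature: 0,
    seed: 42,
  });
  console.log(completion.system_fingerprint, "->", completion.choices[0].message.content);
}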

#Critiques

This is where most prompt engineers would argue you need a multi-stage, agentic, symbiotic, anamorphic pipeline to smooth out these inconsistencies, and that’s where I disagree. If your foundation is flawed, you’re going to be constantly fighting it. Not to mention that when OpenAI sunsets this model, you’ll have to upgrade and do it all over again! Everything is fair game for change.

#Cost

Expensive. Need I say more?

Even more expensive when you factor in wrapping it with a multi-stage, agentic, symbiotic, anamorphic layer to ensure consistency. Not to mention the cost of the time wasted managing the models, handling upgrades, rebuilding the entire pipeline each time a new turn needs to be added, and troubleshooting why all the prompts broke.
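For a rough sense of scale, here’s a back-of-envelope sketch. Every number in it is a hypothetical assumption (check your provider’s current rate card and your own alert volume), not a quote:

// Illustrative assumptions only: hypothetical prices and volumes,
// not any provider's actual rate card.
const INPUT_PRICE_PER_1M = 0.15; // USD per 1M input tokens (assumed)
const OUTPUT_PRICE_PER_1M = 0.6; // USD per 1M output tokens (assumed)

const ALERTS_PER_DAY = 10_000; // assumed alert volume
const CALLS_PER_ALERT = 12; // assumed multi-stage pipeline depth
const INPUT_TOKENS_PER_CALL = 1_000; // assumed
const OUTPUT_TOKENS_PER_CALL = 200; // assumed

const dailyCalls = ALERTS_PER_DAY * CALLS_PER_ALERT;
const dailyCost =
  (dailyCalls * INPUT_TOKENS_PER_CALL * INPUT_PRICE_PER_1M) / 1_000_000 +
  (dailyCalls * OUTPUT_TOKENS_PER_CALL * OUTPUT_PRICE_PER_1M) / 1_000_000;

console.log(`~$${dailyCost.toFixed(2)}/day, ~$${(dailyCost * 30).toFixed(2)}/month`);

Under these made-up numbers, that’s roughly $32 a day, and that’s before retries, longer contexts, or a bigger model.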

#Speed

We take the name Wirespeed seriously. It’s not marketing hype; it’s the north star for our company. If we can’t do it at Wirespeed, why are we doing it? In our early testing, our verdict times are measured in MILLISECONDS. I’ve built AI pipelines for decision making before, similar to what we do in our product, and they would require at least 10-15 calls to a chat model to try (unsuccessfully) to replicate what we’re doing. Each call takes a minimum of 1-2 seconds, which is already a 100x DECREASE in how fast we can protect our customers. These are non-negotiable roadblocks for us.
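To make that arithmetic concrete, here’s a quick sketch using the figures above; the verdict time is an assumed, illustrative value, not a benchmark:

// Back-of-envelope latency math using the figures cited above.
const CHAIN_CALLS = 10; // low end of the 10-15 call pipeline
const MS_PER_CALL = 1_000; // low end of 1-2 seconds per chat call
const VERDICT_MS = 10; // assumed millisecond-scale verdict time

console.log(`Single LLM call: ${MS_PER_CALL / VERDICT_MS}x slower`); // 100x
console.log(`Full pipeline: ${(CHAIN_CALLS * MS_PER_CALL) / VERDICT_MS}x slower`); // 1,000x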

In Summary

Much like the note your high school partner gave you after you broke up, this is written out of love, if a little harsh. I think AI has massive potential and is here to stay, but we didn’t see companies redoing their entire GTM to promote that they’re a “Java company,” a “ReactJS company,” or even a “Mobile company.” Unless you’re in that 1% of companies, you’re not an AI company. You’re a company that uses AI.

Now let’s get back in our caves and keep writing code.


Want to see what Wirespeed is all about? Follow us on LinkedIn / X or join our mailing list.