Razor's Edge: Elastic Notes

So, for those looking to understand this more, here is some of the work that got me so interested in this: excerpts from my writeup pitch and notes from some of the expert calls on the relevant topics. I will say, what was fascinating about this was how bad or stale some of the sell-side takes were. One analyst was throwing serious cold water on spend benefits from vectorization at existing customers; it took about two calls to realize he was clueless. A lot of the sell-side also seemed to think that because this didn't take off in the same window as ChatGPT, it was basically fluff/hype, which was also quickly disproven. So much so, in fact, that the inflection point for gen AI on the enterprise search side basically landed sometime in October. I'm not really going to blame them for this, as September/October were so brutal market-wise that nobody was clearly looking for a product-market-fit moment, and that is what made this setup so fabulous and this work so compelling.

As far as a summary of checks ahead of the print, here are the ones I did:

Call w/ infra lead at a large US financial:

- Evaluating ESRE and likely to adopt in Q1/Q2 2024

- Will likely boost spend with them 60-70%

Partner check:

- $25M Elastic biz

- Growth: 15% in 2022, 15-20% in 2023, "over 20% in 2024 for sure"

- License upgrades and usage driving it

Partner check:

- "Tens of millions" Elastic biz

- Growing 15-20%

- Ticked up over the last two months

- Attributed to more interest in vector search

Partner check:

- $15M Elastic biz

- "Teens-ish growth most of last year, but starting to trend well above that this past quarter"

- No particular use case, just strength on the security side

- A lot more interest around enterprise search in the past month

 

As for color on the AI beneficiary topic and whether it is needle-moving on spend, there is a bunch of stuff, but here is a highlight reel...

 

AI Engineer On Vector Search:

“So, vector search. There is a $30 billion business of work done manually by people who hate their job. In order to have a first preview of whatever needs to be published, you need to work for three months, and the people writing it are contractors. They are not even from your field, so the quality is terrible. On the other hand, you have generative AI, so you can generate things. But obviously, these people need to read thousands and thousands of documents.

There is nowhere near a large language model that can take as input hundreds or thousands of documents; even 50 documents, or even 10 documents, is already too much. So what I did was split the problem into many different problems. I trained models to generate the structure of the publication we would produce, and in every subsection I had to generate something. And I used vector search technology to rank the best documents for each subsection.

So it becomes a section to generate with one or two of the best documents. And why am I now capable of providing this document selection with a very high level of certainty? It's because this technology actually solved meaning. What is important in vector persistence is that it stores the meaning of words, not the words themselves. Vectors are closer to what we think.

So if I say 'my best memory' and you say 'your best memory', we're not talking about the same thing, but there's an approximate agreement where it means pretty much the same thing. Or if I tell you 'catch up later' or 'let's talk later', it's actually the exact same meaning.

And these vectors go beyond just synonyms; they persist meaning, and that's the reason this has become popular: if you type something different in terms of idea or opinion, you will find synonyms in terms of sentences or argumentation, groups of vectors combining and being close to what you need. But sometimes you don't need that: you need a serial ID for one device. There are never any synonyms, and there is no meaning to it, but you need the exact reference for the piece you need right now.

Or we're indexing some historical paper for something that was sold in the '60s, an old car or an old plane, and you need the exact reference to refactor it well. There you need traditional search without any vectors. So these are the two cases where you see that you need both technologies.”
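
(To make his two cases concrete, here is a minimal sketch of the ranking step he describes: embed candidate documents once, then rank them against each subsection by cosine similarity. It assumes a sentence-transformers model; the model name and sample documents are my own illustrative picks, not details from the call.)

```python
# Minimal sketch of "rank the best documents for each subsection".
# Assumes: pip install sentence-transformers numpy; model choice is arbitrary.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence-embedding model

documents = [
    "Survey of customer satisfaction drivers in retail banking...",
    "Maintenance manual for a 1960s-era hydraulic assembly...",
    "Quarterly fraud trends across card-present transactions...",
]
# Embed once; normalized vectors make the dot product equal cosine similarity.
doc_vecs = model.encode(documents, normalize_embeddings=True)

def top_docs_for_subsection(subsection_title: str, k: int = 2):
    """Return the k documents whose *meaning* is closest to the subsection."""
    q = model.encode([subsection_title], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    best = np.argsort(-scores)[:k]
    return [(documents[i], float(scores[i])) for i in best]

# "Meaning, not words": no keyword overlap is needed to rank the right document.
print(top_docs_for_subsection("why cardholders are unhappy"))
```

(His second case, the exact serial-number lookup, is the opposite: a plain keyword/term match with no embedding involved, which is why he argues you need both technologies in one engine.)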

 Q:

Got it. So this could be a really huge opportunity for Elastic when I think about having those two capabilities on one platform, or do you think Elastic is not as well positioned as I'm assuming?

A:

I can tell you vectorization is a huge opportunity. I wrote a paper about the future of machine learning models seven to eight years ago. It was a confidential one, because I wanted to build something very similar to the large language models we have now, except that it would be trained in parallel by multiple models. But the principle is pretty simple.

And actually, how they trained GPT-4 is exactly what I described a few years ago, meaning it was obvious that at a certain point you need to persist vectors and meanings and weights rather than the original thing. And it's a huge opportunity for Elastic, because Elastic has proven that they scale. There are specialized companies: you have companies like Algolia indexing stuff, and you have other search technologies. But Elastic is open source, they are scalable, and they can be deployed anywhere.

And the community knows them. The alternatives are libraries that you would maintain yourself and embed into your code, and things like that. Honestly, when you run a business, you would rather trust a company where you have maintenance and they have already done it; you can trust them on a long-term problem. So there's a place for Elastic without any question. They made the move at the perfect time.

Enterprise Search Focused Consulting Practice:

Q: Your views on ESTC and the search market / vector search trends?

A: OpenAI/ChatGPT is the most significant disruptor of the business in 20 years. The effect it had was to confuse the market, and it has been disruptive to our vendors, b/c they tried to preach and promote their relevancy, and here is OpenAI, which is quasi open source and does a pretty good job answering questions… (so it caused some confusion in 1H23).

ESTC has supported vector search for quite some time, but operationalizing it is the challenge: how do you make your vectors, what should you vectorize, etc.

Overall, the techniques and strategy of what we would do previously for effective search – many of them are clearly transferable to vectors and GenAI, e.g., having clean data, indexing data, etc. What has been disruptive is how ppl will use it, what to expect, and whether they have budgets for it.
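
(For what "operationalizing" means in practice, here is a rough sketch of the make-your-vectors-and-index-them step, assuming Elasticsearch 8.x, the elasticsearch-py client, and an arbitrary 384-dim embedding model. The index and field names are mine, not the consultant's.)

```python
# Sketch: decide what to vectorize, compute embeddings, and store them
# alongside the text so both BM25 and vector search work on the same index.
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

es = Elasticsearch("http://localhost:9200")
model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dim vectors

es.indices.create(
    index="docs",
    mappings={
        "properties": {
            "body": {"type": "text"},  # classic lexical search still works
            "body_vector": {           # the new cost: one vector per document
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

def index_doc(doc_id: str, body: str) -> None:
    """Embed the text (the CPU cost the consultant mentions) and index it."""
    vec = model.encode(body).tolist()
    es.index(index="docs", id=doc_id, document={"body": body, "body_vector": vec})
```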

We see ESTC as set for a significant increase in revenue b/c of this. Vectors take up a lot of data, and processing vectors takes up a lot of CPU cycles, which is effectively how ESTC prices. So the more data you store, the more you spend.

Companies want vector search but may not be completely aware of the spend implications yet… A small business can pay ESTC a couple hundred dollars a month to have search on their website, which is amazing. But if you want to do vector search, it doubles the cost just out of the box. And if you are crawling websites constantly to reindex the data with vectors, it's a significant upsizing in spend.

We have a customer spending $1M per year on ESTC with a B2B use case. They were indexing their product catalog for search and tried to do vector search, but realized their cost would be 2x higher. So they deferred the implementation of vectors until they could index their product catalog on an incremental basis – just the changes, not the full catalog. Then they will implement vectors, b/c they see billions of $s of upsell they can do from vector search.
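
(The incremental approach they landed on is easy to picture: only re-embed rows whose content actually changed, since the embedding pass is the expensive part. A hypothetical sketch – the hashing scheme and function names are mine:)

```python
# Sketch: skip the embedding cost for unchanged catalog rows by comparing
# content hashes against the hash recorded at last index time.
import hashlib
from typing import Callable

def content_hash(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def incremental_reindex(
    catalog: dict[str, str],                # product_id -> description
    seen_hashes: dict[str, str],            # product_id -> hash last indexed
    index_doc: Callable[[str, str], None],  # embeds + writes one doc (see above)
) -> int:
    """Re-embed and re-index only the changed rows; return how many changed."""
    changed = 0
    for pid, text in catalog.items():
        h = content_hash(text)
        if seen_hashes.get(pid) == h:
            continue  # unchanged: no embedding, no reindex, no extra spend
        index_doc(pid, text)
        seen_hashes[pid] = h
        changed += 1
    return changed
```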

Q: Overall ESTC spend growth outlook? Compared to the last couple of years?

A: We are seeing significant increases in customer spend. Four years ago we were struggling to get budgets of $50K ARR per year on average. Now when we're selling new deals we are in six figures, ~$150K ARR – it has increased by 3x.

We're looking at ~40% YoY subscription growth YTD, a lot of it driven by vector search. 1H was much slower: initially ppl were hoping ChatGPT would reduce all their search spend and they were obsessed with that, but then they realized the opposite is true – they need vector search b/c they can't run GenAI effectively otherwise.

As more AI features are built into the search product, we've seen a 2-3x increase in the subscription revenue our customers pay if they are taking advantage of those. If they are not, it goes in the opposite direction: following Moore's Law, cost per unit always goes down, and ESTC helps reduce your spend.

There is a competing free edition of Elasticsearch from AMZN, so ESTC is constantly climbing up the down escalator. They always have to justify their existence – anyone who sells a commercial version of open source software has to constantly innovate to grow. Open source models trail commercial models by ~12-18 months.

VP of Fraud Intelligence at a financial customer spending $300K/yr, on the spend implications of Elastic vector search adoption:

 

So with the introduction of vector databases, it is probably going to require us to increase our resources by 50%. So I expect the prices to go up by at least 50% on that. In any other year, 20% would be normal, just over the course of time with no changes. But as we acquire other companies – which increases the transaction count and the number of cardholders and activities – we do have additions every now and then to cover the extra.

Existing Financial Customer On ESRE And Use Case:

“As soon as I heard the announcement on that, we needed to try it. So right now, we're experimenting with it on the observability side, because it being a vector database, we can share out that data for external analysis, whereas traditionally we couldn't. Even taking something like a credit card number and tokenizing it, there are still patterns that could be derived from that data that could be used to identify a person. So you want to avoid that. Using something like a vector database looks like it's going to help us with that issue. And I think we'd really like to expand that to things like natural language queries.

So say I'm looking for a particular pattern in the data for a certain type of unique fraud that's occurring: describing the types of events I'm looking for and having it produce the output will be equal to what I can do today. However, that opens it up to additional analysts who are perhaps not technical enough to write complex queries, so it kind of increases the audience size. And just being in the business, we'd really love to see something like natural language search for the IVR.

So if you call in to get the balance on your credit card, instead of following a phone tree and going down that rabbit hole, taking several minutes, I want you to be able to ask the question from the start, after authentication. That would overall increase satisfaction for the cardholders and also our clients, and also decrease the amount of time on the phone.

A lot of organizations outsource their IVR, and I have experience with that as well. You pay by the minute for the call. So anything we can do to enhance that and shave off those minutes or seconds is always a good thing, especially when it brings more satisfaction for our clients.

Customer Confirming Elastic Incumbency Advantage:

“So we haven't really looked outside yet. We looked at Elastic just because we were licensed for it and we have the resources. And we ran a test. Visa and Mastercard have pretty strict requirements on how fast you need to approve a transaction, and of all the things that need to happen, I get a little slice of that time: 600 milliseconds for every transaction. Using traditional search for those transactions, it would exceed 600 milliseconds – it would be almost a second, just slow.

We did run some experiments and found that the vector database in Elastic, ESRE, would allow us to come in at sub-600 milliseconds; it was around 400. So that was the first reason why we wanted to deploy this product as well. We haven't looked outside yet.”
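
(A rough sketch of that kind of latency check against the 600 ms budget, assuming an Elasticsearch 8.x index with a dense_vector field as in the earlier sketch; the index/field names and client setup are my assumptions, not details from the call:)

```python
# Sketch: time a kNN query and compare it to the 600 ms approval-budget slice.
import time
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def knn_latency_ms(query_vector: list[float]) -> float:
    """Run one approximate-kNN search and return wall-clock latency in ms."""
    start = time.perf_counter()
    es.search(
        index="transactions",
        knn={
            "field": "body_vector",
            "query_vector": query_vector,
            "k": 10,
            "num_candidates": 100,
        },
    )
    return (time.perf_counter() - start) * 1000.0

# e.g. assert knn_latency_ms(vec) < 600  # the fraud-check slice of the budget
```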

 Call Comment on Competitive Concerns:

 

“Everyone is grasping at straws to figure out what they can do. This is an easy one: I have Elastic. I've had it for many years. I like the product. I go to vector search, I've got AI, and I have the product. It's very easy to do that. It gives me an easy pull into AI.”

Excerpt from Tegus Call w/ AI Expert on SaaS Names Set to Benefit or Not

Yes, I'll just do a lightning round on those guys, if you like, just because we get a lot of them. Hashi, we love them, love them to death. I just don't know how they're going to make money. They just have no PLG strategy to sell up the stack. Developers love them, but there's nothing forcing you to spend more money going up the stack. So they need to get their act together around product-led growth, if you know what I mean by PLG.

Mongo to me is the only enterprise company, in my opinion, that actually truly understands product-led growth, and they mastered it. Elastic was doing really well. I don't know if they're on your list at all. They were doing really well, and then they lost a bunch of their leadership after the IPO, and they sat on a lot of things. But they kind of settled at their lows.

Elastic to me is a super hit for next year, but this is potential; so many things could go wrong. Elastic is very special in one aspect: they've already served the enterprise. They know where all the data lives. They know what the data is. That is the ideal index to build an LLM product on, and they blew it. They already blew it in one sense: they had this market, and they went and built their own vector database and whatnot. And they've had this thing for over a year, which is crazy, but they don't really understand their developers. And as their partner, I love them. We use them for NLP, but we're screaming and waving our hands saying, "You're missing the AI market, it's something that's coming."

Even with NLP, we showed them how you could make money off the NLP market, which was the precursor for LLMs. We've been doing it ever since 2016 and loving their product for it, but they didn't listen, and they don't understand the AI market. But they still might succeed famously, because if developers realize that this is the yellow pages of all data in the enterprise that they could use for their embeddings, Elastic is the ideal product for that. But it's Elastic's to lose: whether they really win the developer market, and whether they up their game.

If they arrogantly assume that they understand AI and try to build this organically, they're going to blow it. But if they really build a partner network, work with start-ups, and really lean into the AI market – not just DevRel, but they're probably also able to make some acquisitions – they could be the big sleeper hit right now. Any questions on them? Or I'll go on to Confluent. I'll give you a couple of quick comments on them.

Totally Unrelated to AI but a Notable Data Point on Splunk/Elastic Observability

EXECUTIVE DIRECTOR AND GLOBAL HEAD AT VERIZON

Our spend with Elastic today is probably in the $5 million to $10 million range. I think that number is going to probably go anywhere from $12 million to $15 million.

Q: Got it. Over what time period?

EXECUTIVE DIRECTOR AND GLOBAL HEAD AT VERIZON

This is going to happen in the next six to 12 months. We are already in the process of migrating quite a bit of our logging infrastructure and logging workloads to the Elastic Stack.