By Eyal Traitel for Beyond The Blocks - Monday, December 05, 2016

storage performance optimization

I talk to customers quite often. When the conversation turns to storage performance, and we talk about 120K IOPS and sub-millisecond latency in OLTP workloads coming off a 2U array that holds more than a hundred terabytes, I typically get asked "how come it's that fast?" or "what is the impact on performance of dedupe, compression, data protection?".

I have been asked these questions several times as our system’s performance really does surprise a lot of people...

Data optimization has several layers to it, but here I describe the four key elements of our architecture that allow us to deliver these IOPS and latency levels:

1. Fast-tier-first (some call it flash-first)

Providing high performance with optimal use of costly media is the way to go.

Our system services writes from the fastest tier. Today this tier consists of a relatively small number of enterprise-grade MLC drives, making the system as fast as all-flash arrays without the need for a large number of these high-performance devices. It's just what you need to get the performance - and as I like to say - “the customer pays for everything”.

2. Everything is deduped and compressed in memory cache

We designed our data path to always dedupe and compress data in-memory - right when the data arrives. This means that all data is optimized, no matter if it’s in volumes, clones or history. Data reduction occurs across the entire pool of blocks.

It also means that we can store more in the first, fastest tier (eMLC today), so everything you expect to read in a daily workload can be served from that tier.
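To make the idea of inline data reduction concrete, here is a minimal sketch in Python. It is not our actual data path: the `InlineReducingStore` class, its method names, and the choice of SHA-256 fingerprints and zlib compression are all assumptions made for illustration. It only shows the general technique of fingerprinting each block as it arrives, storing a compressed copy once, and letting volumes and clones point at the same stored block.

```python
import hashlib
import zlib


class InlineReducingStore:
    """Toy block store that dedupes and compresses blocks as they arrive.

    Hypothetical illustration only; names and algorithms are assumptions.
    """

    def __init__(self):
        self.blocks = {}   # fingerprint -> compressed block bytes
        self.volumes = {}  # (volume name, block address) -> fingerprint

    def write(self, volume, address, data):
        # Fingerprint the incoming block while it is still in memory.
        fingerprint = hashlib.sha256(data).hexdigest()
        # Compress and store the block only if this content is new.
        if fingerprint not in self.blocks:
            self.blocks[fingerprint] = zlib.compress(data)
        # The volume (or clone) just records a pointer to the shared block.
        self.volumes[(volume, address)] = fingerprint

    def read(self, volume, address):
        fingerprint = self.volumes[(volume, address)]
        return zlib.decompress(self.blocks[fingerprint])

    def stored_bytes(self):
        # Bytes actually held after dedupe and compression.
        return sum(len(b) for b in self.blocks.values())


store = InlineReducingStore()
block = b"A" * 4096
store.write("vol1", 0, block)
store.write("clone1", 0, block)  # same content, no second copy stored
print(len(store.blocks), store.stored_bytes())
```

Writing the same 4 KiB block to a volume and to a clone leaves a single compressed copy in the pool, which is why more of the working set fits in the fast tier. The sketch leaves out reclaiming blocks that are no longer referenced.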

3. Focus on real workloads

It’s interesting to see that so many vendors, even those who like to announce that vendors should only quote performance figures for such-and-such block sizes, are all quoting cached numbers.

Customers have real workloads. So we designed, optimized and tested our code not with the aim of beating benchmarks, but for real, random workloads with varying IO sizes. Our customers serve as the proof of our claims, showing incredible improvements from our system.
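As a rough illustration of what "random workloads with varying IO sizes" means in a test, here is a small Python sketch that generates such an IO stream. The function name `mixed_workload`, the IO sizes, the 70/30 read/write split and the 4 KiB alignment are all assumptions picked for the example; they are not figures from our test suite.

```python
import random


def mixed_workload(num_ios, volume_bytes,
                   io_sizes=(4096, 8192, 65536),
                   read_ratio=0.7, seed=0):
    """Yield (op, offset, size) tuples for a random, mixed-size workload.

    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    for _ in range(num_ios):
        # Pick a different IO size each time instead of one fixed block size.
        size = rng.choice(io_sizes)
        # Pick a random 4 KiB-aligned offset that keeps the IO inside the volume.
        max_block = (volume_bytes - size) // 4096
        offset = rng.randint(0, max_block) * 4096
        op = "read" if rng.random() < read_ratio else "write"
        yield op, offset, size


# Example: preview a few IOs against a hypothetical 1 GiB volume.
for op, offset, size in mixed_workload(5, 1 << 30):
    print(op, offset, size)
```

A stream like this touches offsets all over the volume with mixed sizes, so it cannot be served entirely from a small cache the way a single-block-size benchmark can.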

4. Focus on IO path design and code efficiency

At the end of the day, all storage systems use similar hardware - similar kinds of CPUs, RAM etc. When we designed our handling of IOs, we already had in mind dedupe, compression, and flash.

We brought it all together into our new architecture called TimeOS.

This has had a much greater effect on performance than anything else, and it is the key differentiator from both HW-focused designs and disk-based architectures without in-line data reduction (ancient!).

If you want to read more about our performance, I recommend taking a look at our customer case studies...

Got any comments? Please submit them below.

Written by Eyal Traitel

Eyal has 20 years of global experience in diverse roles in IT, Support, Engineering, Sales, Consulting and Product Marketing, in diverse locations (Israel, Netherlands and United States). He is highly skilled in system administration, SAN and NAS concepts, design, implementation and architecture. He is also experienced in pre-sales activities such as RFPs, demos, presentations and marketing events, as well as competitive analysis, product gap analysis and product positioning planning.