In this clip from The Diff, Meta Engineer Masha Blackburn discusses the #OpenSource unified engine Velox, and how developers can use Velox in their workflow, along with what to avoid. Learn more by checking out the full interview: https://lnkd.in/gEP9Byih
Transcript
What would you say people can use Velox for?

So Velox is good if you want to maybe experiment with a particular optimization. Maybe you're doing some research in databases; you could use Velox to bootstrap your project very quickly. So you can get the core algorithm, the core operators, like a set of SQL relational operators: joins, aggregations, window operators, order by. And then you can start adding on top. You can either create an alternative, optimized version of some operator, or you could maybe package it differently and use a different distribution mechanism for how you distribute your query on different machines. You could maybe look into adding some custom functions that you find useful in your workload. So Velox gives you basic functionality, and so you can dig into building new features that you are passionate about.

Cool. For clarity, what would you say Velox is not, or should not be used for?

What Velox should not be used for? So Velox is not a full-blown database, and it's not even a full-blown query engine. Velox is a library, which means that you either should use an existing application that's built on top of Velox, and this way you leverage Velox through that application (an application could be Presto, it could be Spark), or you need to build an application on top of it. So you would not be using Velox standalone as is. You would not provide it as is to end users. It doesn't speak SQL. It expects that the application would translate a query, either in SQL form or in the form of a data frame or really any other input, and convert it into a query plan that's optimized and ready for execution. So it's more of a low-level infrastructure rather than something that can be exposed directly to end users who are interested in digging into the data.
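
The split described in the last answer, where an application translates the user's request into a query plan and a library only executes that plan, can be illustrated with a toy sketch. This is not Velox's API: Velox is a C++ library, and every name below (`Values`, `Filter`, `SumBy`, `execute`, `plan_for_large_orders`) is hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterator

# Toy plan nodes. All names are hypothetical; this is NOT Velox's API.
@dataclass
class Values:
    rows: list[dict]

@dataclass
class Filter:
    source: object
    predicate: Callable[[dict], bool]

@dataclass
class SumBy:
    source: object
    key: str
    column: str

def execute(node) -> Iterator[dict]:
    """The 'library' side: runs a ready-made plan and knows nothing about SQL."""
    if isinstance(node, Values):
        yield from node.rows
    elif isinstance(node, Filter):
        yield from (r for r in execute(node.source) if node.predicate(r))
    elif isinstance(node, SumBy):
        totals: dict = {}
        for r in execute(node.source):
            totals[r[node.key]] = totals.get(r[node.key], 0) + r[node.column]
        for k, v in totals.items():
            yield {node.key: k, "sum": v}
    else:
        raise TypeError(f"unknown plan node: {node!r}")

def plan_for_large_orders(rows: list[dict], threshold: int):
    """The 'application' side: turns a user-level request into a plan."""
    filtered = Filter(Values(rows), lambda r: r["amount"] > threshold)
    return SumBy(filtered, key="region", column="amount")

orders = [
    {"region": "eu", "amount": 10},
    {"region": "us", "amount": 50},
    {"region": "eu", "amount": 70},
    {"region": "us", "amount": 5},
]
result = list(execute(plan_for_large_orders(orders, threshold=8)))
print(result)
```

In this sketch, a SQL parser or a data frame front end would live entirely in the application layer and produce the same plan objects; the executor never sees the original query text.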