Introducing Firebase Data Connect

May 14, 2024

We started Firebase back in 2012 with a single product: a NoSQL database that you could access directly from your web and mobile apps. To this day, we still love NoSQL databases, such as our own Firestore and Realtime Database, for the scalability they offer.

But sometimes you have a use case that just screams for a relational data model, and over the past few years we’ve seen a healthy ecosystem of SQL-based backend services spring up.

Today, we’re introducing Firebase Data Connect, our new backend-as-a-service powered by a Cloud SQL Postgres database that’s high performance, scalable, and secure. You provide your app’s data model through a GraphQL-based schema and queries, and Data Connect creates secure endpoints and typesafe SDKs to access your data. With Data Connect, you can build rich queries with relational joins, complex conditions, and even semantic vector search that are as secure as hand-coded server endpoints.

How does Data Connect work?

There are three parts to Data Connect: schema, queries and mutations, and strongly typed generated client SDKs. The schema defines your data and the relationships between them, while queries and mutations let you interact with your data. Below, we’ll step through each one along with some examples.

Schema

With Data Connect, we want you to focus on application logic rather than keeping your SQL database, app server, and data access code in sync. That’s why every app in Data Connect starts with the data model, which you define in a GraphQL schema like this:

schema.gql

type Movie @table(key: "id") {
  id: String!
  title: String!
  releaseYear: Int!
  genre: String
  rating: Int!
}


Based on this schema, Data Connect automatically generates the PostgreSQL DDL to create tables and helps you migrate your database over time based on changes to the app schema.

Queries and Mutations

Now, you’ll want to show some of this data in your mobile or web applications. To do that, you’ll first create a GraphQL query to get the list of movies:

query.gql

query ListMovies @auth(level: PUBLIC) {
  movies {
    id
    title
    releaseYear
    genre
    rating
  }
}


Data Connect exposes each predefined query through its API server, and client applications can only execute those specific queries.
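
The idea of predefined operations can be illustrated with a small conceptual sketch in JavaScript. This is not how Data Connect is implemented; the names `predefinedOperations` and `executeOperation` and the in-memory data are hypothetical. It only shows the pattern: the server looks up an operation by name and rejects anything that was not deployed ahead of time.

```javascript
// Conceptual sketch only: NOT Data Connect's actual server code.
// All names and sample data here are hypothetical.

// In-memory stand-in for the database.
const movies = [
  { id: "m1", title: "Example Movie", releaseYear: 2020, genre: "Drama", rating: 8 },
];

// Operations that were defined and deployed ahead of time.
const predefinedOperations = new Map([
  ["ListMovies", () => ({ movies })],
]);

// Clients send an operation name, never arbitrary query text.
function executeOperation(name, variables = {}) {
  const operation = predefinedOperations.get(name);
  if (!operation) {
    throw new Error(`Unknown operation: ${name}`);
  }
  return operation(variables);
}

console.log(executeOperation("ListMovies").movies.length);
```

Because the client can only name an operation, it cannot widen a query beyond what was deployed.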

In your app, you’ll also need to change the data stored in the database: add, update, or delete records. In Data Connect, such operations are called mutations. To speed up development, Data Connect automatically generates predefined queries and mutations based on the types and type relationships in your schema, while still giving you the flexibility to write ad hoc operations for specific cases.

Relations

You can also define relationships between types in your schema. For example, here’s the Actor type:

schema.gql

type Actor @table {
   id: String!
   name: String!
}


Each movie can have multiple actors and each actor can play in multiple movies, and a MovieActor table can model that many-to-many relationship:

schema.gql

type MovieActor @table(key: ["movie", "actor"]) {
   movie: Movie!
   actor: Actor!
   role: String!
}


With this updated schema, you can create a query that returns data from both types, filtered by a value the user supplies. In this example, you’ll get all actors whose name ends with a user-specified last name, along with their 10 most recent movies.

query.gql

query ActorsByLastName($lastName: String!) {
  actors(where: {name: {endsWith: $lastName}}) {
    id, name
    movies: movies_via_MovieActor(limit: 10, orderBy: [{releaseYear: DESC}]) {
      id title releaseYear genre rating
    }
  }
}
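
To make the semantics of this query concrete, here is a minimal in-memory sketch in JavaScript of what it asks for: filter actors by name suffix, follow the join table to their movies, sort by release year descending, and keep at most 10. Plain arrays stand in for the tables, the sample rows are made up, and the field names `movieId` and `actorId` are assumptions for illustration; Data Connect itself runs the query against Postgres.

```javascript
// Conceptual sketch: plain arrays stand in for the Actor, Movie, and
// MovieActor tables. All sample data is made up for illustration.
const actors = [
  { id: "a1", name: "Alex Roberts" },
  { id: "a2", name: "Sam Lee" },
];
const movies = [
  { id: "m1", title: "First Film", releaseYear: 1999 },
  { id: "m2", title: "Second Film", releaseYear: 2010 },
  { id: "m3", title: "Third Film", releaseYear: 2016 },
];
// Join table: one row per (movie, actor) pair.
const movieActors = [
  { movieId: "m1", actorId: "a1", role: "Lead" },
  { movieId: "m2", actorId: "a1", role: "Supporting" },
  { movieId: "m3", actorId: "a2", role: "Lead" },
];

function actorsByLastName(lastName, limit = 10) {
  return actors
    // where: {name: {endsWith: $lastName}}
    .filter((actor) => actor.name.endsWith(lastName))
    .map((actor) => {
      // Follow the join table from this actor to their movie ids.
      const movieIds = new Set(
        movieActors
          .filter((row) => row.actorId === actor.id)
          .map((row) => row.movieId)
      );
      // orderBy releaseYear DESC, then limit.
      const actorMovies = movies
        .filter((movie) => movieIds.has(movie.id))
        .sort((a, b) => b.releaseYear - a.releaseYear)
        .slice(0, limit);
      return { ...actor, movies: actorMovies };
    });
}

console.log(JSON.stringify(actorsByLastName("Roberts"), null, 2));
```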


Here again, Data Connect ensures that:

the Postgres database on Cloud SQL is up to date with the schema changes
the API server is updated with the additional query
the auto-generated SDK is updated with the new query and the correct data types

Generated Typesafe Client SDKs

As you design your schemas, queries, and mutations, you’ll deploy them to your Data Connect service. In parallel, Data Connect generates custom web and mobile SDKs that let you call those server-side queries and mutations directly from a Firebase app. You can then integrate methods from this SDK into your client logic. For example, here’s how you’d call the generated JavaScript SDK for the query above:

index.js

import { actorsByLastName } from "@movies/app";

const result = await actorsByLastName({lastName: "Roberts"});

for (const actor of result.data.actors) {
  // actor is a strongly-typed object with all of the properties defined in the 
  // query including a list of movies!
}


Data Connect provides a developer environment and tooling that let you prototype your schemas, queries, and mutations, and that regenerate the client-side SDKs automatically as you go.

Once you’ve finished iterating on your service and client apps, the server-side and client-side updates are ready to deploy together.

Semantic Search with Vector Embeddings

With Data Connect, you can generate a vector embedding by calling out to a model in Vertex AI. Since this happens in the Data Connect API server, it will happen automatically for every write operation.

To do this in your schema, you’ll define the field or column that is relevant to the embeddings that you want to generate, and you’ll give it the Vector type.

schema.gql

type Movie @table {
   …
   description: String
   descriptionEmbedding: Vector! @col(size: 768)
}


In your mutation, you’ll specify which model you’d like the embedding to use:

mutation.gql

mutation createMovie($title: String!, $description: String!, $genre: String!) {
  movie_insert(data: {
    title: $title,
    genre: $genre,
    description: $description,
    descriptionEmbedding_embed: {model: "textembedding-gecko@003", text: $description}
  })
}

mutation updateDescription($id: String!, $description: String!) {
  movie_update(id: $id, data: {
    description: $description, 
    descriptionEmbedding_embed: {model: "textembedding-gecko@003", text: $description}
  })
}


When you query a field that has the Vector type (in this example, descriptionEmbedding), what gets returned is an array of floats that usually means very little to a human reader. But with Data Connect, you can compare embeddings to find similar items.

To perform a similarity search, you’ll write a custom query that uses the automatically generated similarity field for the vector column. Its name follows the pattern ${pluralType}_${vectorFieldName}_similarity.

query.gql

query SearchMovies($query: String!, $minYear: Int) @auth(level: PUBLIC) {
  movies: movies_descriptionEmbedding_similarity(
    compare_embed: {model: "textembedding-gecko@003", text: $query},
    where: {releaseYear: {gte: $minYear}},
    limit: 5) {
      id, title, description, releaseYear
    }
 }
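
For intuition about what "comparing embeddings" means, here is a minimal JavaScript sketch that ranks items by cosine similarity to a query vector. This is an illustration only: the post does not say which distance metric Data Connect uses, so cosine similarity is an assumption, the helper names `cosineSimilarity` and `mostSimilar` are hypothetical, and the vectors are toy 3-dimensional values rather than real 768-dimensional embeddings.

```javascript
// Conceptual sketch: cosine similarity is ASSUMED here; the post does not
// state which metric Data Connect uses. Vectors are toy 3-dimensional
// values, not real model embeddings.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every item against the query, sort best first, keep `limit` items.
function mostSimilar(queryEmbedding, items, limit = 5) {
  return items
    .map((item) => ({
      ...item,
      score: cosineSimilarity(queryEmbedding, item.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const sampleMovies = [
  { id: "m1", embedding: [1, 0, 0] },
  { id: "m2", embedding: [0, 1, 0] },
  { id: "m3", embedding: [0.9, 0.1, 0] },
];
const ranked = mostSimilar([1, 0, 0], sampleMovies, 2);
console.log(ranked.map((movie) => movie.id));
```

Scores are only meaningful when the query and the stored vectors come from the same embedding model, which is why the query and the mutations should name the same model.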


How do I get started?

Data Connect is not ready for production use yet, but we’re excited to give you a sneak peek already and invite you to join our gated preview where we’ll be rolling out invites in the coming weeks and months. In the meantime, you can watch the video or read our documentation to learn more!

The Firebase Blog