Add more docs #1455

Merged
merged 12 commits into from
May 24, 2024
102 changes: 51 additions & 51 deletions README.md

Original file line number Diff line number Diff line change
@@ -41,7 +41,7 @@ You can select the data type for torch tensors in PostgresML by setting the `tor

!!! code\_block time="4584.906 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"model": "tiiuae/falcon-7b-instruct",
@@ -102,7 +102,7 @@ PostgresML will automatically use GPTQ or GGML when a HuggingFace model has one

!!! code\_block time="281.213 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -136,7 +136,7 @@ SELECT pgml.transform(

!!! code\_block time="252.213 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -167,7 +167,7 @@ SELECT pgml.transform(

!!! code\_block time="279.888 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -204,7 +204,7 @@ We can specify the CPU by passing a `"device": "cpu"` argument to the `task`.

!!! code\_block time="266.997 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -236,7 +236,7 @@ SELECT pgml.transform(

!!! code\_block time="33224.136 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -274,7 +274,7 @@ HuggingFace and these libraries have a lot of great models. Not all of these mod

!!! code\_block time="3411.324 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -306,7 +306,7 @@ SELECT pgml.transform(

!!! code\_block time="4198.817 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -338,7 +338,7 @@ SELECT pgml.transform(

!!! code\_block time="4198.817 ms"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -372,7 +372,7 @@ Many of these models are published with multiple different quantization methods

!!! code\_block time="6498.597"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
@@ -410,7 +410,7 @@ Shoutout to [Tostino](https://github.com/Tostino/) for the extended example belo

!!! code\_block time="3784.565"

-```sql
+```postgresql
SELECT pgml.transform(
task => '{
"task": "text-generation",
Original file line number Diff line number Diff line change
@@ -142,7 +142,7 @@ Aside from using this function with strings passed from a client, we can use it

!!! generic

-!!! code\_block time="54.820 ms"
+!!! code_block time="54.820 ms"

```postgresql
SELECT
@@ -156,7 +156,7 @@ LIMIT 1;

!!! results

-```
+```postgressql
CREATE INDEX
```

Original file line number Diff line number Diff line change
@@ -36,7 +36,7 @@ Our search application will start with a **documents** table. Our documents have

!!! code\_block time="10.493 ms"

-```sql
+```postgresql
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
title TEXT,
@@ -54,7 +54,7 @@ We can add new documents to our _text corpus_ with the standard SQL `INSERT` sta

!!! code\_block time="3.417 ms"

-```sql
+```postgresql
INSERT INTO documents (title, body) VALUES
('This is a title', 'This is the body of the first document.'),
('This is another title', 'This is the body of the second document.'),
@@ -79,7 +79,7 @@ You can configure the grammatical rules in many advanced ways, but we'll use the

!!! code\_block time="0.651 ms"

-```sql
+```postgresql
SELECT *
FROM documents
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'second');
@@ -109,7 +109,7 @@ The first step is to store the `tsvector` in the table, so we don't have to gene

!!! code\_block time="17.883 ms"

-```sql
+```postgresql
ALTER TABLE documents
ADD COLUMN title_and_body_text tsvector
GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || body )) STORED;
@@ -125,7 +125,7 @@ One nice aspect of generated columns is that they will backfill the data for exi

!!! code\_block time="5.145 ms"

-```sql
+```postgresql
CREATE INDEX documents_title_and_body_text_index
ON documents
USING GIN (title_and_body_text);
@@ -141,7 +141,7 @@ And now, we'll demonstrate a slightly more complex `tsquery`, that requires both

!!! code\_block time="3.673 ms"

-```sql
+```postgresql
SELECT *
FROM documents
WHERE title_and_body_text @@ to_tsquery('english', 'another & second');
@@ -171,7 +171,7 @@ With multiple query terms OR `|` together, the `ts_rank` will add the numerators

!!! code\_block time="0.561 ms"

-```sql
+```postgresql
SELECT ts_rank(title_and_body_text, to_tsquery('english', 'second | title')), *
FROM documents
ORDER BY ts_rank DESC;
@@ -201,7 +201,7 @@ A quick improvement we could make to our search query would be to differentiate

!!! code\_block time="0.561 ms"

-```sql
+```postgresql
SELECT
ts_rank(title, to_tsquery('english', 'second | title')) AS title_rank,
ts_rank(body, to_tsquery('english', 'second | title')) AS body_rank,
@@ -230,7 +230,7 @@ First things first, we need to record some user clicks on our search results. We

!!! code\_block time="0.561 ms"

-```sql
+```postgresql
CREATE TABLE search_result_clicks (
title_rank REAL,
body_rank REAL,
@@ -250,7 +250,7 @@ I've made up 4 example searches, across our 3 documents, and recorded the `ts_ra

!!! code\_block time="2.161 ms"

-```sql
+```postgresql
INSERT INTO search_result_clicks
(title_rank, body_rank, clicked)
VALUES
@@ -289,7 +289,7 @@ Here goes some machine learning:

!!! code\_block time="6.867 ms"

-```sql
+```postgresql
SELECT * FROM pgml.train(
project_name => 'Search Ranking',
task => 'regression',
@@ -336,7 +336,7 @@ Once a model is trained, you can use `pgml.predict` to use it on new inputs. `pg

!!! code\_block time="3.119 ms"

-```sql
+```postgresql
SELECT
clicked,
pgml.predict('Search Ranking', array[title_rank, body_rank])
@@ -389,7 +389,7 @@ It's nice to organize the query into logical steps, and we can use **Common Tabl

!!! code\_block time="2.118 ms"

-```sql
+```postgresql
WITH first_pass_ranked_documents AS (
SELECT
-- Compute the ts_rank for the title and body text of each document
8 changes: 4 additions & 4 deletions pgml-cms/blog/mindsdb-vs-postgresml.md
Original file line number Diff line number Diff line change
@@ -94,7 +94,7 @@ For both implementations, we can just pass in our data as part of the query for

!!! code\_block time="4769.337 ms"

-```sql
+```postgresql
SELECT pgml.transform(
inputs => ARRAY[
'I am so excited to benchmark deep learning models in SQL. I can not wait to see the results!'
@@ -124,7 +124,7 @@ The first time `transform` is run with a particular model name, it will download

!!! code\_block time="45.094 ms"

-```sql
+```postgresql
SELECT pgml.transform(
inputs => ARRAY[
'I don''t really know if 5 seconds is fast or slow for deep learning. How much time is spent downloading vs running the model?'
@@ -154,7 +154,7 @@ SELECT pgml.transform(

!!! code\_block time="165.036 ms"

-```sql
+```postgresql
SELECT pgml.transform(
inputs => ARRAY[
'Are GPUs really worth it? Sometimes they are more expensive than the rest of the computer combined.'
@@ -209,7 +209,7 @@ psql postgres://mindsdb:[email protected]:55432

And turn timing on to see how long it takes to run the same query:

-```sql
+```postgresql
\timing on
```
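
A minimal sketch of what turning timing on looks like in a `psql` session. The `generate_series` query is only a stand-in chosen for illustration (an assumption, not the benchmark query from the post), and the reported time will vary by machine.

```postgresql
-- psql meta-command: print the elapsed time after every statement
\timing on

-- Placeholder workload (an assumption for illustration, not the post's query)
SELECT count(*) FROM generate_series(1, 1000000);

-- Turn timing back off when finished
\timing off
```

With timing on, `psql` prints a `Time:` line in milliseconds after each result. `\timing` is handled by the `psql` client itself, so it measures the round trip as seen from the client rather than server-side execution alone.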

Original file line number Diff line number Diff line change
@@ -62,7 +62,7 @@ All system statistics are stored together in this one structure.

!!! code\_block

-```sql
+```postgresql
SELECT * FROM pg_stat_sysinfo
WHERE metric = 'load_average'
AND at BETWEEN '2023-04-07 19:20:09.3'
@@ -97,7 +97,7 @@ In the case of the load average, we could handle this situation by having a tabl

!!! code\_block

-```sql
+```postgresql
CREATE TABLE load_average (
at timestamptz NOT NULL DEFAULT now(),
"1m" float4 NOT NULL,
@@ -112,7 +112,7 @@ This structure is fine for `load_average` but wouldn't work for CPU, disk, RAM o

!!! code\_block

-```sql
+```postgresql
CREATE TABLE load_average (
at timestamptz NOT NULL DEFAULT now(),
"1m" float4 NOT NULL,
@@ -132,7 +132,7 @@ This has the disadvantage of baking in a lot of keys and the overall structure o

!!! code\_block

-```sql
+```postgresql
CREATE TABLE load_average (
at timestamptz NOT NULL DEFAULT now(),
"1m" float4 NOT NULL,
2 changes: 1 addition & 1 deletion pgml-cms/blog/postgres-full-text-search-is-awesome.md
Original file line number Diff line number Diff line change
@@ -54,7 +54,7 @@ These queries can execute in milliseconds on large production-sized corpora with

The following full blown example is for demonstration purposes only of a 3rd generation search engine. You can test it for real in the PostgresML Gym to build up a complete understanding.

-```sql
+```postgresql
WITH query AS (
-- construct a query context with arguments that would typically be
-- passed in from the application layer