The first generation of models pretrained on Common Corpus.
PleIAs
company
AI & ML interests: Open Science LLMs
Organization Card
PleIAs is a private French AI lab training the next generation of language models for document processing.
PleIAs is committed to open science and has coordinated the release of some of the largest open corpora for pre-training.
For more information, visit our website: https://pleias.fr/
Contact us: [email protected]
models (16)
PleIAs/Pleias-1b-Preview • Updated • 268 • 1
PleIAs/classifier_transcription • Updated • 14
PleIAs/Headlines-OCR-Correction • Updated
PleIAs/classifier_bandeaux • Updated • 15
PleIAs/journaux-lm-v1 • Updated • 43 • 2
PleIAs/OCRonos-Vintage-CT2 • Updated • 11
PleIAs/celadon • Text Classification • Updated • 270 • 20
PleIAs/Cassandre-RAG • Updated • 147 • 6
PleIAs/Segmentext • Token Classification • Updated • 132 • 13
PleIAs/Florence-PDF • Updated • 68 • 4
datasets (41)
PleIAs/common_corpus • Viewer • Updated • 397M • 58.6k • 167
PleIAs/ToxicCommons • Viewer • Updated • 1.96M • 167 • 6
PleIAs/Openalex-Metadata • Viewer • Updated • 11.7M • 11
PleIAs/Persian-PD • Viewer • Updated • 1.38k • 24
PleIAs/Arabic-PD • Viewer • Updated • 1.82k • 25
PleIAs/Bengali-PD • Viewer • Updated • 3.23k • 34
PleIAs/Urdu-PD • Viewer • Updated • 2.28k • 25
PleIAs/Sanskrit-PD • Viewer • Updated • 3.91k • 10
PleIAs/Catalan-PD • Preview • Updated • 12
PleIAs/Multilingual-PD • Viewer • Updated • 32.6k • 344 • 5