Jump to Content

Defining the technology of today and tomorrow.

Philosophy

We strive to create an environment conducive to many different types of research across many different time scales and levels of risk.
Learn more about our Philosophy

Philosophy

People

Our researchers drive advancements in computer science through both fundamental and applied research.
Learn more about our People

People

Teams

Our research teams have the opportunity to impact technology used by billions of people every day.
Learn more about our Teams

Teams
AI/ML Foundations  & Capabilities

Machine Intelligence

Machine Perception

Machine Translation

Natural Language Processing

Speech Processing

Machine Intelligence

Machine Perception

Machine Translation

Natural Language Processing

Speech Processing

Algorithms & Optimization

Algorithms & Theory

Data Management

Data Mining & Modeling

Information Retrieval & the Web

Algorithms & Theory

Data Management

Data Mining & Modeling

Information Retrieval & the Web

Computing Paradigms

Distributed Systems & Parallel Computing

Hardware & Architecture

Mobile Systems

Networking

Quantum Computing

Robotics

Security, Privacy, & Abuse Prevention

Software Engineering

Software Systems

Distributed Systems & Parallel Computing

Hardware & Architecture

Mobile Systems

Networking

Quantum Computing

Robotics

Security, Privacy, & Abuse Prevention

Software Engineering

Software Systems

Responsible Human-Centric Technology

Human-Computer Interaction & Visualization

Responsible AI

Human-Computer Interaction & Visualization

Responsible AI

Science & Societal Impact

Climate & Sustainability

Economics & Electronic Commerce

Education Innovation

General Science

Health & Bioscience

Climate & Sustainability

Economics & Electronic Commerce

Education Innovation

General Science

Health & Bioscience

Explore research areas
Projects

We regularly open-source projects with the broader research community and apply our developments to Google products.
Learn more about our Projects

Projects

Publications

Publishing our work allows us to share ideas and work collaboratively to advance the field of computer science.
Learn more about our Publications

Publications

Resources

We make products, tools, and datasets available to everyone with the goal of building a more collaborative ecosystem.
Learn more about our Resources

Resources
Shaping the future, together.
Collaborate with us

Student programs

Supporting the next generation of researchers through a wide range of programming.
Learn more about our Student programs

Student programs

Faculty programs

Participating in the academic research community through meaningful engagement with university faculty.
Learn more about our Faculty programs

Faculty programs

Conferences & events

Connecting with the broader research community through events is essential for creating progress in every aspect of our work.
Learn more about our Conferences & events

Conferences & events

Collaborate with us
Careers
Blog

Machine Perception

Research in machine perception tackles the hard problems of understanding images, sounds, music and video. In recent years, our computers have become much better at such tasks, enabling a variety of new applications such as: content-based search in Google Photos and Image Search, natural handwriting interfaces for Android, optical character recognition for Google Drive documents, and recommendation systems that understand music and YouTube videos. Our approach is driven by algorithms that benefit from processing very large, partially-labeled datasets using parallel computing clusters. A good example is our recent work on object recognition using a novel deep convolutional neural network architecture known as Inception that achieves state-of-the-art results on academic benchmarks and allows users to easily search through their large collection of Google Photos. The ability to mine meaningful information from multimedia is broadly applied throughout Google.

Recent Publications

TextMesh: Generation of Realistic 3D Meshes From Text Prompts

Christina Tsalicoglou

Fabian Manhardt

Alessio Tonioni

Michael Niemeyer

Federico Tombari

3DV 2024 (2024)

Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis

Alessandro Bissacco

Michalis Raptis

Winter Conference on Applications of Computer Vision 2024 (2024) (to appear)

MetaMix: Meta-state Precision Searcher for Mixed-precision Activation Quantization

Han-Byul Kim

Joo Hyung Lee

Sungjoo Yoo

Hong-Seok Kim

Proc. The 38th Annual AAAI Conference on Artificial Intelligence (AAAI) (2024)

SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes

Delitzas Alexandros

Ayça Takmaz

Federico Tombari

Marc Pollefeys

Francis Engelmann

CVPR (2024) (to appear)

PhoMoH: Implicit Photo-realistic 3D Models of Human Heads

Mihai Zanfir

Thiemo Alldieck

Cristian Sminchisescu

International Conference on 3D Vision (2024)

LFM-3D: Learnable Feature Matching Across Wide Baselines Using 3D Signals

Arjun Karpur

Guilherme Perrotta

Ricardo Martin-Brualla

Proc. 3DV'24 (2024) (to appear)

Some of our teams