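How Google Security Operations enriches event and entity data

This document describes how Google Security Operations enriches data and the Unified Data Model (UDM) fields where the enriched data is stored.

To support security investigations, Google Security Operations ingests contextual data from different sources, analyzes the data, and provides additional context about artifacts in a customer environment. Analysts can use the contextually enriched data in Detection Engine rules, investigative searches, or reports.

Google Security Operations performs the following types of enrichment:

Enriches entities by using the entity graph and merging.
Calculates a prevalence statistic for each entity, which indicates how popular the entity is in the environment, and enriches the entity with it.
Calculates the first time and the most recent time that certain entity types were seen in the environment.
Enriches entities with information from Safe Browsing threat lists.
Enriches events with geolocation data.
Enriches entities with WHOIS data.
Enriches events with VirusTotal file metadata.
Enriches entities with VirusTotal relationship data.
Ingests and stores Google Cloud Threat Intelligence data.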

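Enriched data from WHOIS, Safe Browsing, GCTI Threat Intelligence, VirusTotal metadata, and VirusTotal relationships is identified by entity_type, product_name, and vendor_name. When you create a rule that uses this enriched data, we recommend that you include a filter in the rule that identifies the specific enrichment type to include. This filter helps improve the performance of the rule. For example, include the following filter fields in the events section of a rule that joins WHOIS data.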

$enrichment.graph.metadata.entity_type = "DOMAIN_NAME"
$enrichment.graph.metadata.product_name = "WHOISXMLAPI Simple Whois"
$enrichment.graph.metadata.vendor_name = "WHOIS"
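
The effect of the three filter fields can be illustrated outside of a rule. The following Python sketch is only an illustration of the selection logic, not a Google Security Operations API: the flat dictionary layout and the `matches_enrichment` helper are assumptions made for this example.

```python
# Hypothetical illustration: each record is a flat dict that mirrors the
# graph.metadata fields used in the filter above.
WHOIS_FILTER = {
    "entity_type": "DOMAIN_NAME",
    "product_name": "WHOISXMLAPI Simple Whois",
    "vendor_name": "WHOIS",
}


def matches_enrichment(metadata, wanted):
    """Return True when every filter field equals the record's metadata value."""
    return all(metadata.get(key) == value for key, value in wanted.items())


records = [
    {
        "entity_type": "DOMAIN_NAME",
        "product_name": "WHOISXMLAPI Simple Whois",
        "vendor_name": "WHOIS",
    },
    # A record from another enrichment type is skipped by the filter.
    {"entity_type": "FILE", "product_name": "Google Safe Browsing"},
]

whois_only = [record for record in records if matches_enrichment(record, WHOIS_FILTER)]
```

Narrowing to one enrichment type before joining keeps unrelated entity records out of the comparison, which is the performance benefit described above.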

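Enrich entities by using the entity graph and merging

The entity graph identifies relationships between entities and resources in your environment. When entities from different sources are ingested into Google Security Operations, the entity graph maintains an adjacency list based on the relationships between the entities. The entity graph enriches context by deduplicating and merging entities.

During deduplication, redundant data is eliminated and intervals are formed to create a common entity. For example, consider two entities, e1 and e2, with timestamps t1 and t2 respectively. Entities e1 and e2 are deduplicated into one entity, and the timestamps that differ are not used in the comparison. The following fields are not used during deduplication: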

collected_timestamp
creation_timestamp
interval
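
The deduplication step can be sketched as follows. This is a minimal illustration, assuming that entities are flat dictionaries and that the generated interval spans the earliest and latest `collected_timestamp` of the duplicates; the document does not specify how the interval is actually derived, and `deduplicate` is a hypothetical helper.

```python
# Fields that the text above lists as ignored during deduplication.
IGNORED_FIELDS = {"collected_timestamp", "creation_timestamp", "interval"}


def dedup_key(entity):
    """Build a comparison key from every field except the ignored ones."""
    return tuple(sorted((k, v) for k, v in entity.items() if k not in IGNORED_FIELDS))


def deduplicate(entities):
    """Collapse entities that differ only in ignored fields into one entity."""
    merged = {}
    for entity in entities:
        key = dedup_key(entity)
        seen_at = entity["collected_timestamp"]
        if key not in merged:
            common = {k: v for k, v in entity.items() if k not in IGNORED_FIELDS}
            common["interval"] = (seen_at, seen_at)  # assumed interval rule
            merged[key] = common
        else:
            start, end = merged[key]["interval"]
            merged[key]["interval"] = (min(start, seen_at), max(end, seen_at))
    return list(merged.values())


e1 = {"hostname": "host-1", "collected_timestamp": 100}  # timestamp t1
e2 = {"hostname": "host-1", "collected_timestamp": 200}  # timestamp t2
common_entities = deduplicate([e1, e2])
```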

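During merging, relationships between entities are formed for a time interval of one day. For example, consider an entity record of user A who has access to a Cloud Storage bucket. There is another entity record of user A who owns a device. After merging, these two records result in a single entity, user A, that has two relations. One relation is that user A has access to the Cloud Storage bucket, and the other relation is that user A owns the device. Google Security Operations performs a five-day lookback when it creates entity context data. This lookback handles late-arriving data and creates an implicit time to live for entity context data.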
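
The user A example can be sketched in a few lines. The record layout and the relation names `HAS_ACCESS_TO` and `OWNS` are assumptions made for illustration; they are not taken from the UDM schema.

```python
from collections import defaultdict


def merge_relations(records):
    """Group the relations of records that describe the same user."""
    merged = defaultdict(list)
    for record in records:
        merged[record["user"]].append(record["relation"])
    return dict(merged)


records = [
    {"user": "user A", "relation": ("HAS_ACCESS_TO", "Cloud Storage bucket")},
    {"user": "user A", "relation": ("OWNS", "device")},
]

# Two input records become one entity with two relations.
merged = merge_relations(records)
```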

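Google Security Operations uses aliasing to enrich telemetry data and uses the entity graph to enrich entities. Detection Engine rules join the merged entities with the enriched telemetry data to provide context-aware analytics.

An event that contains an entity noun is considered an entity. The following are some event types and their corresponding entity types:

ASSET_CONTEXT corresponds to ASSET.
RESOURCE_CONTEXT corresponds to RESOURCE.
USER_CONTEXT corresponds to USER.
GROUP_CONTEXT corresponds to GROUP.

The entity graph uses threat information to distinguish between contextual data and indicators of compromise (IOCs).

When you use contextually enriched data, consider the following entity graph behavior:

Do not add intervals to the entity. Let the entity graph create them instead, because intervals are generated during deduplication unless otherwise specified.
If intervals are specified, only identical events are deduplicated, and the most recent entity is retained.
To ensure that live rules and retrohunts work as expected, entities must be ingested at least once a day.
If entities are not ingested daily and are ingested only once every two or more days, live rules might work as expected, but retrohunts might lose the context of the event.
If entities are ingested more than once a day, they are deduplicated into a single entity.
If event data is missing for a day, the data from the previous day is used temporarily to ensure that live rules work properly.

The entity graph also merges events that have similar identifiers to provide a consolidated view of the data. This merging is based on the following list of identifiers: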

Asset
- entity.asset.product_object_id
- entity.asset.hostname
- entity.asset.asset_id
- entity.asset.mac
User
- entity.user.product_object_id
- entity.user.userid
- entity.user.windows_sid
- entity.user.email_addresses
- entity.user.employee_id
Resource
- entity.resource.product_object_id
- entity.resource.name
Group
- entity.group.product_object_id
- entity.group.email_addresses
- entity.group.windows_sid

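Calculate prevalence statistics

Google Security Operations performs statistical analysis on existing and incoming data and enriches entity context records with prevalence-related metrics.

Prevalence is a numeric value that indicates how popular an entity is. Popularity is defined by the number of assets that access an artifact, such as a domain, file hash, or IP address. The larger the number, the more popular the entity. For example, google.com has high prevalence values because it is accessed frequently. A domain that is accessed infrequently has lower prevalence values. More popular entities are typically less likely to be malicious.

These enriched values are supported for domains, IP addresses, and files (hashes). The values are calculated and stored in the fields listed in this section.

The prevalence statistics for each entity are updated every day. The values are stored in a separate entity context that Detection Engine can use, but they are not displayed in the Google Security Operations investigative views or in UDM search.

You can use the following fields when you create Detection Engine rules.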

Entity type	UDM fields
Domain	`entity.domain.prevalence.day_count` `entity.domain.prevalence.day_max` `entity.domain.prevalence.day_max_sub_domains` `entity.domain.prevalence.rolling_max` `entity.domain.prevalence.rolling_max_sub_domains`
File (hash)	`entity.file.prevalence.day_count` `entity.file.prevalence.day_max` `entity.file.prevalence.rolling_max`
IP address	`entity.artifact.prevalence.day_count` `entity.artifact.prevalence.day_max` `entity.artifact.prevalence.rolling_max`

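The day_max and rolling_max values are calculated differently. The fields are calculated as follows:

day_max is the maximum prevalence score for the artifact during the day, where a day is defined as 00:00:00 to 23:59:59 UTC.
rolling_max is the maximum per-day prevalence score (that is, day_max) for the artifact over the previous 10-day window.
day_count is used to calculate rolling_max, and its value is always 10.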
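
The relationship between the three fields can be sketched as follows. This is an illustration under stated assumptions: the input is a hypothetical list of (timestamp, prevalence score) samples, days without data count as 0, and the 10-day window is taken to end on the day being evaluated. The document does not define these details.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta, timezone


def day_max_by_utc_day(samples):
    """Return the maximum score seen on each UTC calendar day."""
    result = defaultdict(int)
    for observed_at, score in samples:
        utc_day = observed_at.astimezone(timezone.utc).date()
        result[utc_day] = max(result[utc_day], score)
    return dict(result)


def rolling_max(day_max, end_day, day_count=10):
    """Return the highest day_max in the day_count days ending on end_day."""
    window = [end_day - timedelta(days=offset) for offset in range(day_count)]
    return max(day_max.get(day, 0) for day in window)


samples = [
    (datetime(2023, 12, 1, 8, 0, tzinfo=timezone.utc), 50),  # outside the window
    (datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc), 3),
    (datetime(2024, 1, 1, 23, 59, 59, tzinfo=timezone.utc), 7),
    (datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc), 5),
]

day_max = day_max_by_utc_day(samples)
latest = rolling_max(day_max, date(2024, 1, 2))
```

In this sample, the score of 50 from December 1 falls outside the 10-day window, so the rolling maximum for January 2 is the day_max of January 1.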

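When calculated for a domain, the difference between day_max and day_max_sub_domains (and between rolling_max and rolling_max_sub_domains) is as follows:

rolling_max and day_max represent the number of unique internal IP addresses that access a given domain each day, excluding subdomains.
rolling_max_sub_domains and day_max_sub_domains represent the number of unique internal IP addresses that access a given domain, including subdomains.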
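
The difference between the two counts can be sketched with a small example. The access log format, a list of (internal IP address, requested host) pairs, and the `unique_ip_count` helper are assumptions made for illustration.

```python
def unique_ip_count(accesses, domain, include_subdomains):
    """Count unique internal IPs that accessed domain, optionally with subdomains."""
    ips = set()
    for internal_ip, host in accesses:
        is_exact = host == domain
        is_subdomain = host.endswith("." + domain)
        if is_exact or (include_subdomains and is_subdomain):
            ips.add(internal_ip)
    return len(ips)


accesses = [
    ("10.0.0.1", "example.com"),
    ("10.0.0.2", "mail.example.com"),
    ("10.0.0.1", "www.example.com"),
    ("10.0.0.3", "notexample.com"),  # different domain, never counted
]

without_subdomains = unique_ip_count(accesses, "example.com", include_subdomains=False)
with_subdomains = unique_ip_count(accesses, "example.com", include_subdomains=True)
```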

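Prevalence statistics are calculated on newly ingested entity data. Calculations are not performed retroactively on previously ingested data. It takes approximately 36 hours for the statistics to be calculated and stored.

Calculate the first-seen and last-seen times of entities

Google Security Operations performs statistical analysis on incoming data and enriches entity context records with the first-seen and last-seen times of an entity. The first_seen_time field stores the date and time when the entity was first seen in the customer environment. The last_seen_time field stores the date and time of the most recent observation.

Because multiple indicators (UDM fields) can identify an asset or a user, the first-seen time is the first time that any of the indicators that identify the user or asset was seen in the customer environment.

The UDM fields that describe an asset are as follows: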

entity.asset.hostname
entity.asset.ip
entity.asset.mac
entity.asset.asset_id
entity.asset.product_object_id

The UDM fields that describe a user are as follows:

entity.user.windows_sid
entity.user.product_object_id
entity.user.userid
entity.user.employee_id
entity.user.email_addresses
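
The rule that the first-seen time of an asset or user is the earliest time at which any of its indicators was seen can be sketched as follows. The per-indicator timestamps are invented sample values, and `entity_first_seen` is a hypothetical helper.

```python
from datetime import datetime, timezone


def entity_first_seen(indicator_first_seen):
    """Return the earliest first-seen time across all identifying indicators."""
    return min(indicator_first_seen.values())


# Sample values: when each indicator of one asset was first observed.
asset_indicators = {
    "entity.asset.hostname": datetime(2024, 3, 5, tzinfo=timezone.utc),
    "entity.asset.ip": datetime(2024, 3, 1, tzinfo=timezone.utc),
    "entity.asset.mac": datetime(2024, 3, 9, tzinfo=timezone.utc),
}

first_seen = entity_first_seen(asset_indicators)
```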

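The first-seen and last-seen times enable an analyst to correlate activity that occurred after a domain, file (hash), asset, user, or IP address was first seen, or activity that stopped after a domain, file (hash), or IP address was last seen.

The first_seen_time and last_seen_time fields are populated for entities that describe a domain, IP address, or file (hash). For entities that describe a user or asset, only the first_seen_time field is populated. These values are not calculated for entities that describe other types, such as a group or resource.

The statistics are calculated for each entity across all namespaces. Google Security Operations does not calculate the statistics for each entity within individual namespaces. These statistics are not currently exported to the Google Security Operations events schema in BigQuery.

The enriched values are calculated and stored in the following UDM fields:

Entity type	UDM fields
Domain	`entity.domain.first_seen_time` `entity.domain.last_seen_time`
File (hash)	`entity.file.first_seen_time` `entity.file.last_seen_time`
IP address	`entity.artifact.first_seen_time` `entity.artifact.last_seen_time`
Asset	`entity.asset.first_seen_time`
User	`entity.user.first_seen_time`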

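Enrich events with geolocation data

Incoming log data can include external IP addresses without corresponding location information. This is common when an event logs information about device activity that is not in the enterprise network. For example, a login event to a cloud service contains a source or client IP address that is based on the external IP address of the device as returned by the carrier NAT.

Google Security Operations provides geolocation-enriched data for external IP addresses to enable more powerful rule detections and greater context for investigations. For example, Google Security Operations might use an external IP address to enrich the event with information about the country (such as the United States), a specific state (such as Alaska), and the network that the IP address is in (such as the ASN and the carrier name).

Google Security Operations uses location data supplied by Google to provide an approximate geographic location and network information for an IP address. You can write Detection Engine rules against these fields in the events. The enriched event data is also exported to BigQuery, where it can be used in Google Security Operations dashboards and reporting.

The following IP addresses are not enriched:

RFC 1918 private IP address spaces, because they are internal to the enterprise network.
RFC 5771 multicast IP address space, because multicast addresses do not belong to a single location.
IPv6 unique local addresses.
Google Cloud service IP addresses. The exception is Google Cloud Compute Engine external IP addresses, which are enriched.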
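
The first three exclusions can be checked with the Python standard library `ipaddress` module. The following sketch hard-codes the RFC 1918 ranges, the IPv4 multicast block 224.0.0.0/4, and the IPv6 unique local block fc00::/7. It does not cover Google Cloud service IP addresses, and `is_excluded_from_geo_enrichment` is a hypothetical helper, not part of the product.

```python
import ipaddress

RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
MULTICAST_V4 = ipaddress.ip_network("224.0.0.0/4")  # IPv4 multicast space
UNIQUE_LOCAL_V6 = ipaddress.ip_network("fc00::/7")  # IPv6 unique local addresses


def is_excluded_from_geo_enrichment(value):
    """Return True for private, multicast, and unique local addresses."""
    ip = ipaddress.ip_address(value)
    if ip.version == 4:
        return any(ip in net for net in RFC1918_NETWORKS) or ip in MULTICAST_V4
    return ip in UNIQUE_LOCAL_V6


checks = {
    address: is_excluded_from_geo_enrichment(address)
    for address in ("10.1.2.3", "172.32.0.1", "224.0.0.251", "fd12:3456::1", "8.8.8.8")
}
```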

Google Security Operations enriches the following UDM nouns with geolocation data:

principal
target
src
observer

Data type	UDM field
Location (for example, United States)	`( principal \| target \| src \| observer ).ip_geo_artifact.location.country_or_region`
State (for example, New York)	`( principal \| target \| src \| observer ).ip_geo_artifact.location.state`
Longitude	`( principal \| target \| src \| observer ).ip_geo_artifact.location.region_coordinates.longitude`
Latitude	`( principal \| target \| src \| observer ).ip_geo_artifact.location.region_coordinates.latitude`
ASN (autonomous system number)	`( principal \| target \| src \| observer ).ip_geo_artifact.network.asn`
Carrier name	`( principal \| target \| src \| observer ).ip_geo_artifact.network.carrier_name`
DNS domain	`( principal \| target \| src \| observer ).ip_geo_artifact.network.dns_domain`
Organization name	`( principal \| target \| src \| observer ).ip_geo_artifact.network.organization_name`
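
In the table, the notation `( principal | target | src | observer )` is shorthand for four separate field paths, one per noun. The following sketch expands the shorthand into concrete paths; `expand_noun_notation` is a hypothetical helper written for this document's notation only.

```python
import re


def expand_noun_notation(pattern):
    """Expand '( a | b ).rest' into ['a.rest', 'b.rest']."""
    match = re.fullmatch(r"\(\s*(.+?)\s*\)(\..+)", pattern.strip())
    if match is None:
        return [pattern.strip()]  # already a single concrete path
    # Accept both '|' and the escaped form that appears in the tables.
    nouns = [noun.strip() for noun in match.group(1).replace("\\|", "|").split("|")]
    return [noun + match.group(2) for noun in nouns]


paths = expand_noun_notation(
    "( principal | target | src | observer ).ip_geo_artifact.network.asn"
)
```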

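The following example shows the type of geographic information that is added to a UDM event whose IP address is tagged to the Netherlands:

UDM field	Value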
`principal.ip_geo_artifact.location.country_or_region`	`Netherlands`
`principal.ip_geo_artifact.location.region_coordinates.latitude`	`52.132633`
`principal.ip_geo_artifact.location.region_coordinates.longitude`	`5.291266`
`principal.ip_geo_artifact.network.asn`	`8455`
`principal.ip_geo_artifact.network.carrier_name`	`schuberg philis`

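Inconsistencies

Google's proprietary IP geolocation technology uses a combination of networking data and other inputs and methods to provide IP address location and network resolution for our users. Other organizations might use different signals or methods, which can occasionally lead to different results.

If you find an inconsistency in the IP geolocation results provided by Google, open a customer support case so that we can investigate and, if appropriate, correct our records.

Enrich entities with information from Safe Browsing threat lists

Google Security Operations ingests data from Safe Browsing that is related to file hashes. The data for each file is stored as an entity and provides additional context about the file. Analysts can create Detection Engine rules that query this entity context data to build context-aware analytics.

The following information is stored with the entity context record.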

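UDM field	Description
`entity.metadata.product_entity_id`	A unique identifier for the entity.
`entity.metadata.entity_type`	This value is `FILE`, which indicates that the entity describes a file.
`entity.metadata.collected_timestamp`	The date and time that the entity was observed or the event occurred.
`entity.metadata.interval`	Stores the start time and end time for which this data is valid. Because threat list content changes over time, `start_time` and `end_time` reflect the time interval during which the data about the entity is valid. For example, a file hash was observed to be malicious or suspicious between `start_time` and `end_time`.
`entity.metadata.threat.category`	The Google Security Operations `SecurityCategory`. This is set to one or more of the following values: `SOFTWARE_MALICIOUS`: indicates that the threat is related to malware. `SOFTWARE_PUA`: indicates that the threat is related to unwanted software.
`entity.metadata.threat.severity`	The Google Security Operations `ProductSeverity`. If the value is `CRITICAL`, the artifact appears to be malicious. If the value is not specified, there is not enough confidence to indicate that the artifact is malicious.
`entity.metadata.product_name`	Stores the value `Google Safe Browsing`.
`entity.file.sha256`	The SHA256 hash value of the file.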

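Enrich entities with WHOIS data

Google Security Operations ingests WHOIS data daily. During ingestion of incoming customer device data, Google Security Operations evaluates the domains in the customer data against the WHOIS data. When there is a match, Google Security Operations stores the related WHOIS data with the entity record for the domain. For each entity where entity.metadata.entity_type = DOMAIN_NAME, Google Security Operations enriches the entity with information from WHOIS.

Google Security Operations populates the enriched WHOIS data into the following fields in the entity record: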

entity.domain.admin.attribute.labels
entity.domain.audit_update_time
entity.domain.billing.attribute.labels
entity.domain.billing.office_address.country_or_region
entity.domain.contact_email
entity.domain.creation_time
entity.domain.expiration_time
entity.domain.iana_registrar_id
entity.domain.name_server
entity.domain.private_registration
entity.domain.registrant.company_name
entity.domain.registrant.office_address.state
entity.domain.registrant.office_address.country_or_region
entity.domain.registrant.email_addresses
entity.domain.registrant.user_display_name
entity.domain.registrar
entity.domain.registry_data_raw_text
entity.domain.status
entity.domain.tech.attribute.labels
entity.domain.update_time
entity.domain.whois_record_raw_text
entity.domain.whois_server
entity.domain.zone

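For a description of these fields, see the Unified Data Model field list document.

Ingest and store Google Cloud Threat Intelligence data

Google Security Operations ingests data from Google Cloud Threat Intelligence (GCTI) data sources that provide contextual information you can use when you investigate activity in your environment. You can query the following data sources:

GCTI Tor Exit Nodes: IP addresses that are known Tor exit nodes.
GCTI Benign Binaries: files that are either part of the original operating system distribution or were updated by an official operating system patch. This data source excludes some official operating system binaries that adversaries have misused through activity common in living-off-the-land attacks, such as binaries focused on initial entry vectors.
GCTI Remote Access Tools: files that malicious actors have used frequently. These tools are generally legitimate applications that are sometimes abused to connect remotely to compromised systems.

This contextual data is stored globally as entities. You can query the data by using Detection Engine rules. Include the following UDM fields and values in the rule to query these global entities: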
graph.metadata.vendor_name = Google Cloud Threat Intelligence
graph.metadata.product_name = GCTI Feed

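In this document, the placeholder <variable_name> represents the unique variable name used in a rule to identify a UDM record.

Timed versus timeless Google Cloud Threat Intelligence data sources

Google Cloud Threat Intelligence data sources are either timed or timeless.

Timed data sources have a time range associated with each entry. This means that if a detection is generated on day 1, the same detection is expected to be generated for day 1 during a retrohunt on any later day.

Timeless data sources have no associated time range, because only the latest set of data should be considered. Timeless data sources are frequently used for data such as file hashes that are not expected to change. If no detection is generated on day 1, a detection might be generated for day 1 during a retrohunt on day 2 because a new entry was added.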
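
The difference between the two kinds of data sources can be sketched as follows. The entry layout is an assumption made for illustration: a timed entry carries an `interval` of (first day, last day), and a timeless entry has no `interval` key. Days are plain integers to keep the example short.

```python
def feed_matches(feed, value, event_day):
    """Return True if value is listed in feed for an event on event_day."""
    for entry in feed:
        if entry["value"] != value:
            continue
        interval = entry.get("interval")
        if interval is None:
            return True  # timeless: only the latest data set matters
        start_day, end_day = interval
        if start_day <= event_day <= end_day:
            return True  # timed: the event must fall inside the entry's range
    return False


# Timed source: the entry is valid for day 1 only, whenever the check runs.
timed_feed = [{"value": "198.51.100.7", "interval": (1, 1)}]

# Timeless source: the feed content on day 1 and on day 2.
timeless_feed_day1 = []
timeless_feed_day2 = [{"value": "hash-added-on-day-2"}]

retrohunt_on_day1 = feed_matches(timeless_feed_day1, "hash-added-on-day-2", 1)
retrohunt_on_day2 = feed_matches(timeless_feed_day2, "hash-added-on-day-2", 1)
```

A retrohunt over the same day 1 event gives a different answer once the timeless feed gains the new entry, while the timed entry always gives the same answer for day 1.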

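Data about Tor exit node IP addresses

Google Security Operations ingests and stores IP addresses that are known Tor exit nodes. Tor exit nodes are points at which traffic exits the Tor network. Information ingested from this data source is stored in the following UDM fields. Data in this source is timed.

UDM field	Description
`<variable_name>.graph.metadata.vendor_name`	Stores the value `Google Cloud Threat Intelligence`.
`<variable_name>.graph.metadata.product_name`	Stores the value `GCTI Feed`.
`<variable_name>.graph.metadata.threat.threat_feed_name`	Stores the value `Tor Exit Nodes`.
`<variable_name>.graph.entity.artifact.ip`	Stores the IP address ingested from the GCTI data source.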

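Data about benign operating system files

Google Security Operations ingests and stores file hashes from the GCTI Benign Binaries data source. Information ingested from this data source is stored in the following UDM fields. Data in this source is timeless.

UDM field	Description
`<variable_name>.graph.metadata.vendor_name`	Stores the value `Google Cloud Threat Intelligence`.
`<variable_name>.graph.metadata.product_name`	Stores the value `GCTI Feed`.
`<variable_name>.graph.metadata.threat.threat_feed_name`	Stores the value `Benign Binaries`.
`<variable_name>.graph.entity.file.sha256`	Stores the SHA256 hash value of the file.
`<variable_name>.graph.entity.file.sha1`	Stores the SHA1 hash value of the file.
`<variable_name>.graph.entity.file.md5`	Stores the MD5 hash value of the file.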

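Data about remote access tools

The remote access tools data source includes file hashes for known remote access tools, such as VNC clients, that malicious actors have used frequently. These tools are generally legitimate applications that are sometimes abused to connect remotely to compromised systems. Information ingested from this data source is stored in the following UDM fields. Data in this source is timeless.

UDM field	Description
`<variable_name>.graph.metadata.vendor_name`	Stores the value `Google Cloud Threat Intelligence`.
`<variable_name>.graph.metadata.product_name`	Stores the value `GCTI Feed`.
`<variable_name>.graph.metadata.threat.threat_feed_name`	Stores the value `Remote Access Tools`.
`<variable_name>.graph.entity.file.sha256`	Stores the SHA256 hash value of the file.
`<variable_name>.graph.entity.file.sha1`	Stores the SHA1 hash value of the file.
`<variable_name>.graph.entity.file.md5`	Stores the MD5 hash value of the file.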

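Enrich events with VirusTotal file metadata

Google Security Operations enriches UDM events with file hash information and provides additional context during an investigation. UDM events are enriched through hash aliasing in the customer environment. Hash aliasing combines all types of file hashes and provides information about a file hash during a search.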
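
The idea behind hash aliasing can be sketched as an index in which every hash type of a file points to the same record, so that a search on any one hash returns the full metadata. This is a conceptual illustration only; the record layout and `build_alias_index` are assumptions, and the sample hashes are computed from made-up content.

```python
import hashlib


def build_alias_index(files):
    """Map every known hash of a file to the same metadata record."""
    index = {}
    for record in files:
        for algorithm in ("md5", "sha1", "sha256"):
            digest = record.get(algorithm)
            if digest:
                index[digest] = record
    return index


content = b"example file content"  # made-up sample content
record = {
    "md5": hashlib.md5(content).hexdigest(),
    "sha1": hashlib.sha1(content).hexdigest(),
    "sha256": hashlib.sha256(content).hexdigest(),
    "names": ["example.bin"],
}

index = build_alias_index([record])
# A lookup by MD5 returns the record that also carries the SHA-256 value.
found = index[record["md5"]]
```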

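The integration of VirusTotal file metadata and relationship enrichment with Google SecOps can be used to identify patterns of malicious activity and to track malware movement across a network.

A raw log provides only limited information about a file. VirusTotal enriches the event with file metadata to provide a dump of bad hashes along with metadata about the bad files. The metadata includes information such as file names, types, imported functions, and tags. You can use this information in UDM search and in Detection Engine with YARA-L to understand bad file events, and more generally during threat hunting. A sample use case is to detect any modification of an original file, which in turn imports the file metadata for threat detection.

The following information is stored with the record. For a list of all UDM fields, see the Unified Data Model field list.

Data type	UDM field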
SHA-256	`( principal \| target \| src \| observer ).file.sha256`
MD5	`( principal \| target \| src \| observer ).file.md5`
SHA-1	`( principal \| target \| src \| observer ).file.sha1`
Size	`( principal \| target \| src \| observer ).file.size`
Ssdeep	`( principal \| target \| src \| observer ).file.ssdeep`
Vhash	`( principal \| target \| src \| observer ).file.vhash`
Authentihash	`( principal \| target \| src \| observer ).file.authentihash`
File type	`( principal \| target \| src \| observer ).file.file_type`
Tags	`( principal \| target \| src \| observer ).file.tags`
Capabilities tags	`( principal \| target \| src \| observer ).file.capabilities_tags`
Names	`( principal \| target \| src \| observer ).file.names`
First seen time	`( principal \| target \| src \| observer ).file.first_seen_time`
Last seen time	`( principal \| target \| src \| observer ).file.last_seen_time`
Last modification time	`( principal \| target \| src \| observer ).file.last_modification_time`
Last analysis time	`( principal \| target \| src \| observer ).file.last_analysis_time`
Embedded URLs	`( principal \| target \| src \| observer ).file.embedded_urls`
Embedded IPs	`( principal \| target \| src \| observer ).file.embedded_ips`
Embedded domains	`( principal \| target \| src \| observer ).file.embedded_domains`
Signature information	`( principal \| target \| src \| observer ).file.signature_info`
Signature information: sigcheck	`( principal \| target \| src \| observer ).file.signature_info.sigcheck`
Signature information: sigcheck verification message	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.verification_message`
Signature information: sigcheck verified	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.verified`
Signature information: sigcheck signers	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.signers`
Signature information: sigcheck signer name	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.signers.name`
Signature information: sigcheck signer status	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.signers.status`
Signature information: sigcheck signer valid usage of certificate	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.signers.valid_usage`
Signature information: sigcheck signer certificate issuer	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.signers.cert_issuer`
Signature information: sigcheck X509	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.x509`
Signature information: sigcheck X509 name	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.x509.name`
Signature information: sigcheck X509 algorithm	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.x509.algorithm`
Signature information: sigcheck X509 thumbprint	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.x509.thumbprint`
Signature information: sigcheck X509 certificate issuer	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.x509.cert_issuer`
Signature information: sigcheck X509 serial number	`( principal \| target \| src \| observer ).file.signature_info.sigcheck.x509.serial_number`
Signature information: codesign	`( principal \| target \| src \| observer ).file.signature_info.codesign`
Signature information: codesign ID	`( principal \| target \| src \| observer ).file.signature_info.codesign.id`
Signature information: codesign format	`( principal \| target \| src \| observer ).file.signature_info.codesign.format`
Signature information: codesign compilation time	`( principal \| target \| src \| observer ).file.signature_info.codesign.compilation_time`
Exiftool information	`( principal \| target \| src \| observer ).file.exif_info`
Exiftool information: original file name	`( principal \| target \| src \| observer ).file.exif_info.original_file`
Exiftool information: product name	`( principal \| target \| src \| observer ).file.exif_info.product`
Exiftool information: company name	`( principal \| target \| src \| observer ).file.exif_info.company`
Exiftool information: file description	`( principal \| target \| src \| observer ).file.exif_info.file_description`
Exiftool information: entry point	`( principal \| target \| src \| observer ).file.exif_info.entry_point`
Exiftool information: compilation time	`( principal \| target \| src \| observer ).file.exif_info.compilation_time`
PDF information	`( principal \| target \| src \| observer ).file.pdf_info`
PDF information: number of /JS tags	`( principal \| target \| src \| observer ).file.pdf_info.js`
PDF information: number of /JavaScript tags	`( principal \| target \| src \| observer ).file.pdf_info.javascript`
PDF information: number of /Launch tags	`( principal \| target \| src \| observer ).file.pdf_info.launch_action_count`
PDF information: number of object streams	`( principal \| target \| src \| observer ).file.pdf_info.object_stream_count`
PDF information: number of object definitions (endobj keyword)	`( principal \| target \| src \| observer ).file.pdf_info.endobj_count`
PDF information: PDF version	`( principal \| target \| src \| observer ).file.pdf_info.header`
PDF information: number of /AcroForm tags	`( principal \| target \| src \| observer ).file.pdf_info.acroform`
PDF information: number of /AA tags	`( principal \| target \| src \| observer ).file.pdf_info.autoaction`
PDF information: number of /EmbeddedFile tags	`( principal \| target \| src \| observer ).file.pdf_info.embedded_file`
PDF information: /Encrypt tag	`( principal \| target \| src \| observer ).file.pdf_info.encrypted`
PDF information: number of /RichMedia tags	`( principal \| target \| src \| observer ).file.pdf_info.flash`
PDF information: number of /JBIG2Decode tags	`( principal \| target \| src \| observer ).file.pdf_info.jbig2_compression`
PDF information: number of object definitions (obj keyword)	`( principal \| target \| src \| observer ).file.pdf_info.obj_count`
PDF information: number of defined stream objects (endstream keyword)	`( principal \| target \| src \| observer ).file.pdf_info.endstream_count`
PDF information: number of pages in the PDF	`( principal \| target \| src \| observer ).file.pdf_info.page_count`
PDF information: number of defined stream objects (stream keyword)	`( principal \| target \| src \| observer ).file.pdf_info.stream_count`
PDF information: number of /OpenAction tags	`( principal \| target \| src \| observer ).file.pdf_info.openaction`
PDF information: number of startxref keywords	`( principal \| target \| src \| observer ).file.pdf_info.startxref`
PDF information: number of colors expressed with more than 3 bytes (CVE-2009-3459)	`( principal \| target \| src \| observer ).file.pdf_info.suspicious_colors`
PDF information: number of trailer keywords	`( principal \| target \| src \| observer ).file.pdf_info.trailer`
PDF information: number of /XFA tags found	`( principal \| target \| src \| observer ).file.pdf_info.xfa`
PDF information: number of xref keywords	`( principal \| target \| src \| observer ).file.pdf_info.xref`
PE file metadata	`( principal \| target \| src \| observer ).file.pe_file`
PE file metadata: imphash	`( principal \| target \| src \| observer ).file.pe_file.imphash`
PE file metadata: entry point	`( principal \| target \| src \| observer ).file.pe_file.entry_point`
PE file metadata: entry point exiftool	`( principal \| target \| src \| observer ).file.pe_file.entry_point_exiftool`
PE file metadata: compilation time	`( principal \| target \| src \| observer ).file.pe_file.compilation_time`
PE file metadata: compilation exiftool time	`( principal \| target \| src \| observer ).file.pe_file.compilation_exiftool_time`
PE file metadata: sections	`( principal \| target \| src \| observer ).file.pe_file.section`
PE file metadata: section name	`( principal \| target \| src \| observer ).file.pe_file.section.name`
PE file metadata: section entropy	`( principal \| target \| src \| observer ).file.pe_file.section.entropy`
PE file metadata: section raw size in bytes	`( principal \| target \| src \| observer ).file.pe_file.section.raw_size_bytes`
PE file metadata: section virtual size in bytes	`( principal \| target \| src \| observer ).file.pe_file.section.virtual_size_bytes`
PE file metadata: section MD5 hex	`( principal \| target \| src \| observer ).file.pe_file.section.md5_hex`
PE file metadata: imports	`( principal \| target \| src \| observer ).file.pe_file.imports`
PE file metadata: imports library	`( principal \| target \| src \| observer ).file.pe_file.imports.library`
PE file metadata: imports functions	`( principal \| target \| src \| observer ).file.pe_file.imports.functions`
PE file metadata: resource information	`( principal \| target \| src \| observer ).file.pe_file.resource`
PE file metadata: resource information SHA-256 hex	`( principal \| target \| src \| observer ).file.pe_file.resource.sha256_hex`
PE file metadata: resource information, type of resource as identified by the magic Python module	`( principal \| target \| src \| observer ).file.pe_file.resource.filetype_magic`
PE file metadata: resource information, human-readable version of the language and sublanguage identifiers, as defined in the Windows PE specification	`( principal \| target \| src \| observer ).file.pe_file.resource.language_code`
PE file metadata: resource information entropy	`( principal \| target \| src \| observer ).file.pe_file.resource.entropy`
PE file metadata: resource information file type	`( principal \| target \| src \| observer ).file.pe_file.resource.file_type`
PE file metadata: number of resources by resource type	`( principal \| target \| src \| observer ).file.pe_file.resources_type_count_str`
PE file metadata: number of resources by language	`( principal \| target \| src \| observer ).file.pe_file.resources_language_count_str`

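Enrich entities with VirusTotal relationship data

VirusTotal helps analyze suspicious files, domains, IP addresses, and URLs to detect malware and other types of breaches, and shares the findings with the security community. Google Security Operations ingests data from VirusTotal related connections. This data is stored as entities and provides information about the relationships between file hashes and files, domains, IP addresses, and URLs.

Analysts can use this data to determine whether a file hash is bad, based on information about the URL or domain from other sources. This information can be used to create Detection Engine rules that query the entity context data to build context-aware analytics.

This data is available only with certain VirusTotal and Google Security Operations licenses. Check your entitlements with your account manager.

The following information is stored with the entity context record:

UDM field	Description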
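`entity.metadata.product_entity_id`	A unique identifier for the entity.
`entity.metadata.entity_type`	Stores the value `FILE`, which indicates that the entity describes a file.
`entity.metadata.interval`	`start_time` is the start of the time range and `end_time` is the end of the time range for which this data is valid.
`entity.metadata.source_labels`	Stores a list of `source_id` and `target_id` key-value pairs for this entity. `source_id` is the file hash, and `target_id` can be the hash or the value of the URL, domain name, or IP address that this file is related to. You can search for the URL, domain name, IP address, or file on virustotal.com.
`entity.metadata.product_name`	Stores the value `VirusTotal Relationships`.
`entity.metadata.vendor_name`	Stores the value `VirusTotal`.
`entity.file.sha256`	Stores the SHA-256 hash value of the file.
`entity.file.relations`	A list of child entities to which the parent file entity is related.
`entity.relations.relationship`	Describes the type of relationship between the parent and child entities. The value can be `EXECUTES`, `DOWNLOADED_FROM`, or `CONTACTS`.
`entity.relations.direction`	Stores the value `UNIDIRECTIONAL` and indicates the direction of the relationship with the child entity.
`entity.relations.entity.url`	The URL that the file in the parent entity contacts (if the relationship between the parent entity and the URL is `CONTACTS`), or the URL from which the file in the parent entity was downloaded (if the relationship between the parent entity and the URL is `DOWNLOADED_FROM`).
`entity.relations.entity.ip`	A list of IP addresses that the file in the parent entity contacts or was downloaded from. The list contains only one IP address.
`entity.relations.entity.domain.name`	The domain name that the file in the parent entity contacts or was downloaded from.
`entity.relations.entity.file.sha256`	Stores the SHA-256 hash value of the related file.
`entity.relations.entity_type`	Contains the type of the entity in the relationship. The value can be `URL`, `DOMAIN_NAME`, `IP_ADDRESS`, or `FILE`. The corresponding fields are populated according to `entity_type`. For example, if `entity_type` is `URL`, then `entity.relations.entity.url` is populated.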
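
The last row implies a one-to-one mapping between `entity_type` and the field that carries the related value. The following sketch writes that mapping down using the field names from the table. Only the `URL` case is stated explicitly in the table; the other three pairings are inferred from the field names, and `populated_field` is a hypothetical helper.

```python
# Mapping from entity.relations.entity_type to the field that is populated.
# Only the URL pairing is stated in the table; the others are inferred.
POPULATED_FIELD = {
    "URL": "entity.relations.entity.url",
    "DOMAIN_NAME": "entity.relations.entity.domain.name",
    "IP_ADDRESS": "entity.relations.entity.ip",
    "FILE": "entity.relations.entity.file.sha256",
}


def populated_field(entity_type):
    """Return the UDM field expected to be populated for a relation type."""
    try:
        return POPULATED_FIELD[entity_type]
    except KeyError:
        raise ValueError(f"unsupported entity_type: {entity_type}") from None


url_field = populated_field("URL")
```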

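Next steps

For information about how to use enriched data with other Google Security Operations features, see the related Google Security Operations documentation.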
