<html><body>
<style>
body, h1, h2, h3, div, span, p, pre, a {
margin: 0;
padding: 0;
border: 0;
font-weight: inherit;
font-style: inherit;
font-size: 100%;
font-family: inherit;
vertical-align: baseline;
}
body {
font-size: 13px;
padding: 1em;
}
h1 {
font-size: 26px;
margin-bottom: 1em;
}
h2 {
font-size: 24px;
margin-bottom: 1em;
}
h3 {
font-size: 20px;
margin-bottom: 1em;
margin-top: 1em;
}
pre, code {
line-height: 1.5;
font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace;
}
pre {
margin-top: 0.5em;
}
h1, h2, h3, p {
font-family: Arial, sans-serif;
}
h1, h2, h3 {
border-bottom: solid #CCC 1px;
}
.toc_element {
margin-top: 0.5em;
}
.firstline {
margin-left: 2em;
}
.method {
margin-top: 1em;
border: solid 1px #CCC;
padding: 1em;
background: #EEE;
}
.details {
font-weight: bold;
font-size: 14px;
}
</style>
<h1><a href="dlp_v2.html">Cloud Data Loss Prevention (DLP) API</a> . <a href="dlp_v2.projects.html">projects</a> . <a href="dlp_v2.projects.jobTriggers.html">jobTriggers</a></h1>
<h2>Instance Methods</h2>
<p class="toc_element">
<code><a href="#activate">activate(name, body=None, x__xgafv=None)</a></code></p>
<p class="firstline">Activate a job trigger. Causes the immediate execute of a trigger</p>
<p class="toc_element">
<code><a href="#create">create(parent, body, x__xgafv=None)</a></code></p>
<p class="firstline">Creates a job trigger to run DLP actions such as scanning storage for</p>
<p class="toc_element">
<code><a href="#delete">delete(name, x__xgafv=None)</a></code></p>
<p class="firstline">Deletes a job trigger.</p>
<p class="toc_element">
<code><a href="#get">get(name, x__xgafv=None)</a></code></p>
<p class="firstline">Gets a job trigger.</p>
<p class="toc_element">
<code><a href="#list">list(parent, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None, filter=None)</a></code></p>
<p class="firstline">Lists job triggers.</p>
<p class="toc_element">
<code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p>
<p class="firstline">Retrieves the next page of results.</p>
<p class="toc_element">
<code><a href="#patch">patch(name, body, x__xgafv=None)</a></code></p>
<p class="firstline">Updates a job trigger.</p>
<h3>Method Details</h3>
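<p>The <code>relativeLikelihood</code> comments in the schemas in this section describe a shift along the likelihood scale that is clamped at both ends. The sketch below illustrates that arithmetic only. It assumes the five levels are ordered from <code>VERY_UNLIKELY</code> to <code>VERY_LIKELY</code>, and <code>LEVELS</code> and <code>adjust_likelihood</code> are hypothetical names used for illustration, not part of the API.</p>

```python
# Assumed ordering of the likelihood levels named in the schema comments.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(base, relative):
    """Shift `base` by `relative` levels, clamping to the ends of the scale."""
    index = LEVELS.index(base) + relative
    index = max(0, min(index, len(LEVELS) - 1))
    return LEVELS[index]


# Examples taken from the field description:
print(adjust_likelihood("POSSIBLE", 1))   # LIKELY
print(adjust_likelihood("POSSIBLE", -1))  # UNLIKELY
# +1 on VERY_LIKELY is clamped, so a following -1 lands on LIKELY.
print(adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1))  # LIKELY
```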
<div class="method">
<code class="details" id="activate">activate(name, body=None, x__xgafv=None)</code>
<pre>Activates a job trigger. Causes the trigger to execute immediately
instead of waiting for the trigger event to occur.
Args:
name: string, Resource name of the trigger to activate, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)
body: object, The request body.
The object takes the form of:
{ # Request message for ActivateJobTrigger.
}
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Combines all of the information about a DLP job.
"errors": [ # A stream of errors encountered running the job.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"name": "A String", # The server-assigned name.
"inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source.
"requestedOptions": { # The configuration used for this job.
"snapshotInspectTemplate": { # The inspectTemplate contains a configuration (set of types of sensitive data # If run with an InspectTemplate, a snapshot of its state at the time of
# this run.
# to be detected) to be used anywhere you otherwise would normally specify
# InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates
# to learn more.
"updateTime": "A String", # The last update timestamp of a inspectTemplate, output only field.
"displayName": "A String", # Display name (max 256 chars).
"description": "A String", # Short description (max 256 chars).
"inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process.
# When used with redactContent only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that
# support reversing.
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
# infoType, when the name matches one of existing infoTypes and that infoType
# is specified in `InspectContent.info_types` field. Specifying the latter
# adds findings to the one detected by the system. If built-in info type is
# not specified in `InspectContent.info_types` list then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in order that they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or
# contained within with a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS" the phone number findings are dropped if they overlap
# with EMAIL_ADDRESS finding.
# That leads to "555-222-2222@example.org" to generate only a single
# finding, namely email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"createTime": "A String", # The creation timestamp of a inspectTemplate, output only field.
"name": "A String", # The template name. Output only.
#
# The template will have one of the following formats:
# `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR
# `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID`
},
"jobConfig": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
# A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields in the format, like `person.birthdate.year`, are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification.
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent, only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that
# support reversing.
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
# infoType, when the name matches one of existing infoTypes and that infoType
# is specified in `InspectContent.info_types` field. Specifying the latter
# adds findings to the one detected by the system. If built-in info type is
# not specified in `InspectContent.info_types` list then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order in which they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
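#
# Illustrative sketch of this clamping in Python (an assumption made for
# exposition, not the service's implementation; `adjust` is a hypothetical
# helper):
#   LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]
#   def adjust(level, delta):
#       i = LEVELS.index(level) + delta
#       return LEVELS[max(0, min(i, len(LEVELS) - 1))]
#   adjust("POSSIBLE", 1)                 # "LIKELY"
#   adjust(adjust("VERY_LIKELY", 1), -1)  # "LIKELY"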
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order in which they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or
# contained within with a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS" the phone number findings are dropped if they overlap
# with EMAIL_ADDRESS finding.
# That leads to "555-222-2222@example.org" to generate only a single
# finding, namely email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's
# completion/failure.
# completion/failure.
},
"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
# Command Center (CSCC Alpha).
# This action is only available for projects which are parts of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and are governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
},
"result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job.
"infoTypeStats": [ # Statistics of how many instances of each info type were found during
# inspect job.
{ # Statistics regarding a specific InfoType.
"count": "A String", # Number of findings for this infoType.
"infoType": { # Type of information detected by the API. # The type of finding this stat is for.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
},
],
"totalEstimatedBytes": "A String", # Estimate of the number of bytes to process.
"processedBytes": "A String", # Total size in bytes that were processed.
},
},
"riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source.
"numericalStatsResult": { # Result of the numerical stats computation.
"quantileValues": [ # List of 99 values that partition the set of field values into 100 equal
# sized buckets.
{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
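# As an illustrative check in Python, `len(str(123456789).encode("utf-8"))`
# evaluates to 9.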
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
],
"maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
"minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
},
"kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an
# estimation, not exact values.
"kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value
# doesn't correspond to any such interval, the associated frequency is
# zero. For example, the following records:
# {min_anonymity: 1, max_anonymity: 1, frequency: 17}
# {min_anonymity: 2, max_anonymity: 3, frequency: 42}
# {min_anonymity: 5, max_anonymity: 10, frequency: 99}
# mean that there are no records with an estimated anonymity of 4 or
# larger than 10.
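#
# Illustrative Python lookup over the example records above (a sketch;
# `bucket_frequency_for` is a hypothetical helper, not part of the API):
#   buckets = [(1, 1, 17), (2, 3, 42), (5, 10, 99)]
#   def bucket_frequency_for(k):
#       return next((f for lo, hi, f in buckets if lo <= k <= hi), 0)
#   bucket_frequency_for(2)  # 42 (records with estimated anonymity 2 or 3)
#   bucket_frequency_for(4)  # 0, since no interval contains 4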
{ # A KMapEstimationHistogramBucket message with the following values:
# min_anonymity: 3
# max_anonymity: 5
# frequency: 42
# means that there are 42 records whose quasi-identifier values correspond
# to 3, 4 or 5 people in the overlying population. An important particular
# case is when min_anonymity = max_anonymity = 1: the frequency field then
# corresponds to the number of uniquely identifiable records.
"bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
# number of classes returned per bucket is capped at 20.
{ # A tuple of values for the quasi-identifier columns.
"estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values.
"quasiIdsValues": [ # The quasi-identifier values.
{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
],
},
],
"minAnonymity": "A String", # Always positive.
"bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
"maxAnonymity": "A String", # Always greater than or equal to min_anonymity.
"bucketSize": "A String", # Number of records within these anonymity bounds.
},
],
},
"kAnonymityResult": { # Result of the k-anonymity computation.
"equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes.
{
"bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
# classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same k-anonymity value.
"quasiIdsValues": [ # Set of values defining the equivalence class. One value per
# quasi-identifier column in the original KAnonymity metric message.
# The order is always the same as the original request.
{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
],
"equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the
# above set of values.
},
],
"bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
"equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket.
"equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket.
"bucketSize": "A String", # Total number of equivalence classes in this bucket.
},
],
},
"lDiversityResult": { # Result of the l-diversity computation.
"sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies.
{
"bucketValues": [ # Sample of equivalence classes in this bucket. The total number of
# classes returned per bucket is capped at 20.
{ # The set of columns' values that share the same l-diversity value.
"numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class.
"quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence
# class. The order is always the same as the original request.
{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
],
"topSensitiveValues": [ # Estimated frequencies of top sensitive values.
{ # A value of a field, including its frequency.
"count": "A String", # How many times the value is contained in the field.
"value": { # Set of primitive values supported by the system. # A value contained in the field in question.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
},
],
"equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class.
},
],
"bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket.
"bucketSize": "A String", # Total number of equivalence classes in this bucket.
"sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence
# classes in this bucket.
"sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence
# classes in this bucket.
},
],
},
"requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute.
"numericalStatsConfig": { # Compute numerical stats over an individual column, including
# min, max, and quantiles.
"field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are
# integer, float, date, datetime, timestamp, time.
"name": "A String", # Name describing the field.
},
},
"kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what
# is called "journalist risk" in the literature, except the attack dataset is
# statistically modeled instead of being perfectly known. This can be done
# using publicly available data (like the US Census), or using a custom
# statistical model (indicated as one or several BigQuery tables), or by
# extrapolating from the distribution of values in the input dataset.
"regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
# Required if no column is tagged with a region-specific InfoType (like
# US_ZIP_5) or a region code.
"quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the
# same tag. [required]
{ # A column with a semantic tag attached.
"field": { # General identifier of a data field in a storage service. # Identifies the column. [required]
"name": "A String", # Name describing the field.
},
"customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
# indicate an auxiliary table that contains statistical information on
# the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
# dataset as a statistical model of population, if available. We
# currently support US ZIP codes, region codes, ages and genders.
# To programmatically obtain the list of supported InfoTypes, use
# ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"inferred": { # If no semantic tag is indicated, we infer the statistical model from
# the distribution of values in the input data.
# A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is the empty JSON object `{}`.
},
},
],
"auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
# used to tag a quasi-identifiers column must appear in exactly one column
# of one auxiliary table.
{ # An auxiliary table contains statistical information on the relative
# frequency of different quasi-identifiers values. It has one or several
# quasi-identifiers columns, and one column that indicates the relative
# frequency of each quasi-identifier tuple.
# If a tuple is present in the data but not in the auxiliary table, the
# corresponding relative frequency is assumed to be zero (and thus, the
# tuple is highly reidentifiable).
"relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number
# between 0 and 1 (inclusive). Null values are assumed to be zero.
# [required]
"name": "A String", # Name describing the field.
},
"quasiIds": [ # Quasi-identifier columns. [required]
{ # A quasi-identifier column has a custom_tag, used to know which column
# in the data corresponds to which column in the statistical model.
"field": { # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
"customTag": "A String",
},
],
"table": { # Auxiliary table location. [required]
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
],
},
"lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk.
"sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value.
"name": "A String", # Name describing the field.
},
"quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are
# defined for the l-diversity computation. When multiple fields are
# specified, they are considered a single composite key.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
},
"deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to
# figure out that one given individual appears in a de-identified dataset.
# Similarly to the k-map metric, we cannot compute δ-presence exactly without
# knowing the attack dataset, so we use a statistical model instead.
"regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling.
# Required if no column is tagged with a region-specific InfoType (like
# US_ZIP_5) or a region code.
"quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the
# same tag. [required]
{ # A column with a semantic tag attached.
"field": { # General identifier of a data field in a storage service. # Identifies the column. [required]
"name": "A String", # Name describing the field.
},
"customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must
# indicate an auxiliary table that contains statistical information on
# the possible values of this column (below).
"infoType": { # Type of information detected by the API. # A column can be tagged with an InfoType to use the relevant public
# dataset as a statistical model of population, if available. We
# currently support US ZIP codes, region codes, ages and genders.
# To programmatically obtain the list of supported InfoTypes, use
# ListInfoTypes with the supported_by=RISK_ANALYSIS filter.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"inferred": { # If no semantic tag is indicated, we infer the statistical model from
# the distribution of values in the input data.
# A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is the empty JSON object `{}`.
},
},
],
"auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag
# used to tag a quasi-identifiers field must appear in exactly one
# field of one auxiliary table.
{ # An auxiliary table containing statistical information on the relative
# frequency of different quasi-identifiers values. It has one or several
# quasi-identifiers columns, and one column that indicates the relative
# frequency of each quasi-identifier tuple.
# If a tuple is present in the data but not in the auxiliary table, the
# corresponding relative frequency is assumed to be zero (and thus, the
# tuple is highly reidentifiable).
"relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number
# between 0 and 1 (inclusive). Null values are assumed to be zero.
# [required]
"name": "A String", # Name describing the field.
},
"quasiIds": [ # Quasi-identifier columns. [required]
{ # A quasi-identifier column has a custom_tag, used to know which column
# in the data corresponds to which column in the statistical model.
"field": { # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
"customTag": "A String",
},
],
"table": { # Auxiliary table location. [required]
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
],
},
"categoricalStatsConfig": { # Compute categorical stats over an individual column, including
# number of distinct values and value count distribution.
"field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are
# supported except for arrays and structs. However, it may be more
# informative to use NumericalStats when the field type is supported,
# depending on the data.
"name": "A String", # Name describing the field.
},
},
"kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk.
"entityId": { # Optional message indicating that multiple rows might be associated with a
# single individual. If the same entity_id is associated with multiple
# quasi-identifier tuples over distinct rows, we consider the entire
# collection of tuples as the composite quasi-identifier. This collection
# is a multiset: the order in which the different tuples appear in the
# dataset is ignored, but their frequency is taken into account.
#
# Important note: a maximum of 1000 rows can be associated with a single
# entity ID. If more rows are associated with the same entity ID, some
# might be ignored.
#
# An entity in a dataset is a field or set of fields that correspond to a
# single person. For example, in medical records the `EntityId` might be a
# patient identifier, or for financial records it might be an account
# identifier. This message is used when generalizations or analysis must take
# into account that multiple rows correspond to the same entity.
"field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier.
"name": "A String", # Name describing the field.
},
},
"quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are
# specified, they are considered a single composite key. Structs and
# repeated data types are not supported; however, nested fields are
# supported so long as they are not structs themselves or nested within
# a repeated field.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
},
},
"categoricalStatsResult": { # Result of the categorical stats computation.
"valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column.
{
"bucketValues": [ # Sample of value frequencies in this bucket. The total number of
# values returned per bucket is capped at 20.
{ # A value of a field, including its frequency.
"count": "A String", # How many times the value is contained in the field.
"value": { # Set of primitive values supported by the system. # A value contained in the field in question.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
},
],
"bucketValueCount": "A String", # Total number of distinct values in this bucket.
"valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket.
"valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket.
"bucketSize": "A String", # Total number of values in this bucket.
},
],
},
"deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an
# estimation, not exact values.
"deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a
# value doesn't correspond to any such interval, the associated frequency
# is zero. For example, the following records:
# {min_probability: 0, max_probability: 0.1, frequency: 17}
# {min_probability: 0.2, max_probability: 0.3, frequency: 42}
# {min_probability: 0.3, max_probability: 0.4, frequency: 99}
# mean that there are no records with an estimated probability in [0.1, 0.2)
# or greater than or equal to 0.4.
{ # A DeltaPresenceEstimationHistogramBucket message with the following
# values:
# min_probability: 0.1
# max_probability: 0.2
# frequency: 42
# means that there are 42 records for which δ is in [0.1, 0.2). An
# important particular case is when min_probability = max_probability = 1:
# then, every individual who shares this quasi-identifier combination is in
# the dataset.
"bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total
# number of classes returned per bucket is capped at 20.
{ # A tuple of values for the quasi-identifier columns.
"quasiIdsValues": [ # The quasi-identifier values.
{ # Set of primitive values supported by the system.
# Note that for the purposes of inspection or transformation, the number
# of bytes considered to comprise a 'Value' is based on its representation
# as a UTF-8 encoded string. For example, if 'integer_value' is set to
# 123456789, the number of bytes would be counted as 9, even though an
# int64 only holds up to 8 bytes of data.
"floatValue": 3.14,
"timestampValue": "A String",
"dayOfWeekValue": "A String",
"timeValue": { # Represents a time of day. The date and time zone are either not significant
# or are specified elsewhere. An API may choose to allow leap seconds. Related
# types are google.type.Date and `google.protobuf.Timestamp`.
"hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose
# to allow the value "24:00:00" for scenarios like business closing time.
"nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999.
"seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may
# allow the value 60 if it allows leap-seconds.
"minutes": 42, # Minutes of hour of day. Must be from 0 to 59.
},
"dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day
# and time zone are either specified elsewhere or are not significant. The date
# is relative to the Proleptic Gregorian Calendar. This can represent:
#
# * A full date, with non-zero year, month and day values
# * A month and day value, with a zero year, e.g. an anniversary
# * A year on its own, with zero month and day values
# * A year and month value, with a zero day, e.g. a credit card expiration date
#
# Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`.
"year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without
# a year.
"day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0
# if specifying a year by itself or a year and month where the day is not
# significant.
"month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a
# month and day.
},
"stringValue": "A String",
"booleanValue": True or False,
"integerValue": "A String",
},
],
"estimatedProbability": 3.14, # The estimated probability that a given individual sharing these
# quasi-identifier values is in the dataset. This value, typically called
# δ, is the ratio between the number of records in the dataset with these
# quasi-identifier values, and the total number of individuals (inside
# *and* outside the dataset) with these quasi-identifier values.
# For example, if there are 15 individuals in the dataset who share the
# same quasi-identifier values, and an estimated 100 people in the entire
# population with these values, then δ is 0.15.
},
],
"bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket.
"bucketSize": "A String", # Number of records within these probability bounds.
"maxProbability": 3.14, # Always greater than or equal to min_probability.
"minProbability": 3.14, # Between 0 and 1.
},
],
},
"requestedSourceTable": { # Input dataset to compute metrics over.
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"state": "A String", # State of a job.
"jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that
# instantiated the job.
"startTime": "A String", # Time when the job started.
"endTime": "A String", # Time when the job finished.
"type": "A String", # The type of job.
"createTime": "A String", # Time when the job was created.
}</pre>
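<p>The k-map histogram and the δ ratio described in the response above can be summarized client-side. The following is a minimal sketch, not part of the client library: the helper names and <code>SAMPLE_K_MAP_RESULT</code> are hypothetical, the sample reuses the three example intervals from the <code>kMapEstimationHistogram</code> description with the documented <code>bucketSize</code> field, and int64 fields are assumed to arrive as strings, as the "A String" placeholders suggest.</p>

```python
from typing import Any, Dict

# Hypothetical sample shaped like "kMapEstimationResult".
# int64 fields are assumed to be encoded as strings in the response.
SAMPLE_K_MAP_RESULT: Dict[str, Any] = {
    "kMapEstimationHistogram": [
        {"minAnonymity": "1", "maxAnonymity": "1", "bucketSize": "17"},
        {"minAnonymity": "2", "maxAnonymity": "3", "bucketSize": "42"},
        {"minAnonymity": "5", "maxAnonymity": "10", "bucketSize": "99"},
    ]
}


def records_with_anonymity_at_most(k_map_result: Dict[str, Any], k: int) -> int:
    """Sum bucketSize over buckets whose whole anonymity range is at most k."""
    total = 0
    for bucket in k_map_result.get("kMapEstimationHistogram", []):
        # Convert the string-encoded int64 values before comparing.
        if k >= int(bucket["maxAnonymity"]):
            total += int(bucket["bucketSize"])
    return total


def unique_record_count(k_map_result: Dict[str, Any]) -> int:
    """Records in the min_anonymity = max_anonymity = 1 bucket.

    minAnonymity is always positive, so maxAnonymity == 1 implies both are 1.
    """
    return records_with_anonymity_at_most(k_map_result, 1)


def delta_presence(records_in_dataset: int, population_estimate: int) -> float:
    """Ratio of matching records in the dataset to matching individuals overall."""
    if population_estimate >= 1:
        return records_in_dataset / population_estimate
    raise ValueError("population_estimate must be positive")
```

<p>With the sample above, <code>unique_record_count(SAMPLE_K_MAP_RESULT)</code> counts only the first bucket, and <code>delta_presence(15, 100)</code> reproduces the worked example given for <code>estimatedProbability</code>.</p>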
</div>
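<p>The <code>create</code> method documented next states several limits on its request body: the <code>triggerId</code> pattern and 100-character maximum, and the <code>displayName</code> and <code>description</code> lengths. A minimal sketch of checking them locally before sending a request follows. The helper name <code>build_create_body</code> and the constants are hypothetical, the server remains the authority on what it accepts, and dropping the output-only fields is a design choice of this sketch rather than a documented requirement.</p>

```python
import re
from typing import Any, Dict

# Pattern taken from the triggerId description; the hyphen is escaped for
# clarity, and re.ASCII restricts \d to 0-9 (an assumption of this sketch).
TRIGGER_ID_PATTERN = re.compile(r"[a-zA-Z\d\-_]+", re.ASCII)
MAX_TRIGGER_ID_LENGTH = 100
MAX_DISPLAY_NAME_LENGTH = 100
MAX_DESCRIPTION_LENGTH = 256
# Fields the reference marks as output only.
OUTPUT_ONLY_FIELDS = ("updateTime", "errors")


def build_create_body(job_trigger: Dict[str, Any], trigger_id: str = "") -> Dict[str, Any]:
    """Return a CreateJobTrigger request body after basic local checks."""
    if trigger_id:
        if len(trigger_id) > MAX_TRIGGER_ID_LENGTH:
            raise ValueError("triggerId is longer than 100 characters")
        if not TRIGGER_ID_PATTERN.fullmatch(trigger_id):
            raise ValueError("triggerId contains unsupported characters")
    if len(job_trigger.get("displayName", "")) > MAX_DISPLAY_NAME_LENGTH:
        raise ValueError("displayName is longer than 100 characters")
    if len(job_trigger.get("description", "")) > MAX_DESCRIPTION_LENGTH:
        raise ValueError("description is longer than 256 characters")

    # Copy the trigger without the output-only fields.
    trigger = {k: v for k, v in job_trigger.items() if k not in OUTPUT_ONLY_FIELDS}
    body: Dict[str, Any] = {"jobTrigger": trigger}
    if trigger_id:
        # An empty triggerId is omitted so the system can generate one.
        body["triggerId"] = trigger_id
    return body
```

<p>Assuming <code>service</code> was obtained with <code>googleapiclient.discovery.build("dlp", "v2")</code>, the result could then be passed as <code>service.projects().jobTriggers().create(parent="projects/my-project-id", body=body).execute()</code>.</p>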
<div class="method">
<code class="details" id="create">create(parent, body, x__xgafv=None)</code>
<pre>Creates a job trigger to run DLP actions such as scanning storage for
sensitive information on a set schedule.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Args:
parent: string, The parent resource name, for example projects/my-project-id. (required)
body: object, The request body. (required)
The object takes the form of:
{ # Request message for CreateJobTrigger.
"triggerId": "A String", # The trigger id can contain uppercase and lowercase letters,
# numbers, hyphens, and underscores; that is, it must match the regular
# expression: `[a-zA-Z\\d-_]+`. The maximum length is 100
# characters. Can be empty to allow the system to generate one.
"jobTrigger": { # Contains a configuration to make DLP API calls on a repeating basis. # The JobTrigger to create.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # A status for this trigger. [required]
"updateTime": "A String", # The last update timestamp of a triggeredJob. Output only field.
"errors": [ # A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared. Output only field.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"displayName": "A String", # Display name (max 100 chars)
"description": "A String", # User provided description (max 256 chars)
"inspectJob": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields in a format like `person.birthdate.year` are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Complete BigQuery table reference.
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Google Cloud Storage options specification.
# Options defining a file or a set of files within a Google Cloud Storage
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
#
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# even if this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 even if this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing.
#
# Message for detecting output from deidentification transformations
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
# infoType, when the name matches one of existing infoTypes and that infoType
# is specified in `InspectContent.info_types` field. Specifying the latter
# adds findings to the ones detected by the system. If the built-in infoType is
# not specified in the `InspectContent.info_types` list, then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # A list of phrases to detect as a CustomInfoType.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in order that they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
#
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
#
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # Exclusion rule.
# The rule that specifies conditions when findings of infoTypes specified in
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of excluded infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # An ExclusionRule with an infoType list drops a finding when it overlaps with
# or is contained within a finding of an infoType from this list. For
# example, if `InspectionRuleSet.info_types` contains "PHONE_NUMBER" and
# `exclusion_rule` contains `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Dictionary which defines the rule.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # Save resulting findings in a provided location.
# If set, the detailed findings will be persisted to the specified
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
#
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on a job's
# completion or failure.
},
"publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha).
# Publish the result summary of a DlpJob to the Cloud Security
# Command Center (CSCC Alpha).
# This action is only available for projects that are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a notification to a Pub/Sub topic.
# Publish a message into the given Pub/Sub topic when the DlpJob has completed. The
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must grant
# publishing access rights to the DLP API service account executing
# the long-running DlpJob that sends the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list
# needs to trigger for a job to be started. The list may contain only
# a single Schedule trigger and must have at least one object.
{ # What event needs to occur for a new job to be started.
"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.
"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.
},
},
],
"lastRunTime": "A String", # The timestamp of the last time this trigger executed, output only field.
"createTime": "A String", # The creation timestamp of a triggeredJob, output only field.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/jobTriggers/53234423`.
},
}
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Contains a configuration to make dlp api calls on a repeating basis.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # A status for this trigger. [required]
"updateTime": "A String", # The last update timestamp of a triggeredJob, output only field.
"errors": [ # A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared. Output only field.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"displayName": "A String", # Display name (max 100 chars)
"description": "A String", # User provided description (max 256 chars)
"inspectJob": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields in the format, like `person.birthdate.year`, are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Complete BigQuery table reference.
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Google Cloud Storage options specification.
# Options defining a file or a set of files within a Google Cloud Storage
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
#
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# even if this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 even if this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing, such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
# infoType, when the name matches one of the existing infoTypes and that infoType
# is specified in the `InspectContent.info_types` field. Specifying the latter
# adds findings to the ones detected by the system. If the built-in info type is
# not specified in the `InspectContent.info_types` list then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # A list of phrases to detect as a CustomInfoType.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order in which they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
#
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order in which they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
#
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # Exclusion rule.
# The rule that specifies conditions when findings of infoTypes specified in
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # An infoType list in an ExclusionRule drops a finding when it overlaps with or
# is contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Dictionary which defines the rule.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # Save resulting findings in a provided location.
# If set, the detailed findings will be persisted to the specified
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode as a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
#
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on a job's
# completion/failure.
},
"publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha).
# Publish the result summary of a DlpJob to the Cloud Security
# Command Center (CSCC Alpha).
# This action is only available for projects which are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and are governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a notification to a pubsub topic.
# Publish a message into given Pub/Sub topic when DlpJob has completed. The
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list
# needs to trigger for a job to be started. The list may contain only
# a single Schedule trigger and must have at least one object.
{ # What event needs to occur for a new job to be started.
"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.
"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.
},
},
],
"lastRunTime": "A String", # The timestamp of the last time this trigger executed, output only field.
"createTime": "A String", # The creation timestamp of a triggeredJob, output only field.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/jobTriggers/53234423`.
}</pre>
</div>
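<div class="method">
<p>The schedule constraints described above (a recurrence period of at least 1 day and at most 60 days, carried by a single Schedule trigger) can be checked on the client before calling <code>create</code>. The following is a minimal sketch, not part of the client library: the helper name <code>schedule_trigger</code> is hypothetical, and it assumes the duration is sent in the standard protobuf JSON form of whole seconds followed by <code>s</code> (for example <code>86400s</code>).</p>

```python
# Bounds taken from the field description above: at least 1 day, at most 60 days.
SECONDS_PER_DAY = 24 * 60 * 60
MIN_PERIOD_SECONDS = SECONDS_PER_DAY
MAX_PERIOD_SECONDS = 60 * SECONDS_PER_DAY


def schedule_trigger(period_seconds):
    """Return a one-element `triggers` list holding a single Schedule trigger.

    Hypothetical helper for illustration; the service performs the real validation.
    """
    if isinstance(period_seconds, bool) or not isinstance(period_seconds, int):
        raise TypeError("period_seconds must be an int")
    if period_seconds not in range(MIN_PERIOD_SECONDS, MAX_PERIOD_SECONDS + 1):
        raise ValueError("recurrence period must be between 1 and 60 days")
    # Assumption: durations use the protobuf JSON form, seconds plus an "s" suffix.
    return [{"schedule": {"recurrencePeriodDuration": f"{period_seconds}s"}}]


# A daily schedule, matching the "every day (86400 seconds)" example above.
daily_triggers = schedule_trigger(SECONDS_PER_DAY)
```

<p>The returned list could be used as the <code>triggers</code> value of a job trigger body. The service remains the authority on what it accepts.</p>
</div>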
<div class="method">
<code class="details" id="delete">delete(name, x__xgafv=None)</code>
<pre>Deletes a job trigger.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Args:
name: string, Resource name of the project and the triggeredJob, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # A generic empty message that you can re-use to avoid defining duplicated
# empty messages in your APIs. A typical example is to use it as the request
# or the response type of an API method. For instance:
#
# service Foo {
# rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty);
# }
#
# The JSON representation for `Empty` is the empty JSON object `{}`.
}</pre>
</div>
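<div class="method">
<p>A call to <code>delete</code> as documented above might be wrapped as in the sketch below. It is illustrative only: <code>delete_job_trigger</code> is a hypothetical helper, the name check is an assumption based on the example resource name in the argument description, and <code>service</code> is assumed to be a client object built elsewhere (for example with <code>googleapiclient.discovery.build("dlp", "v2")</code>).</p>

```python
import re

# Assumed shape of a trigger resource name, based on the documented example
# `projects/dlp-test-project/jobTriggers/53234423`.
_TRIGGER_NAME = re.compile(r"projects/[^/]+/jobTriggers/[^/]+")


def delete_job_trigger(service, name):
    """Delete the job trigger `name` and return the API response.

    Hypothetical helper. `service` is any object exposing the
    projects().jobTriggers().delete(name=...).execute() call chain.
    """
    if not _TRIGGER_NAME.fullmatch(name):
        raise ValueError(f"not a job trigger resource name: {name!r}")
    # Per the documentation above, a successful delete returns an empty object.
    return service.projects().jobTriggers().delete(name=name).execute()
```

<p>Rejecting malformed names locally avoids a round trip for obvious mistakes. Any error returned by the service itself is left to propagate to the caller.</p>
</div>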
<div class="method">
<code class="details" id="get">get(name, x__xgafv=None)</code>
<pre>Gets a job trigger.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Args:
name: string, Resource name of the project and the triggeredJob, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
Returns:
An object of the form:
{ # Contains a configuration to make dlp api calls on a repeating basis.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # A status for this trigger. [required]
"updateTime": "A String", # The last update timestamp of a triggeredJob, output only field.
"errors": [ # A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared. Output only field.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"displayName": "A String", # Display name (max 100 chars)
"description": "A String", # User provided description (max 256 chars)
"inspectJob": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields, in a format like `person.birthdate.year`, are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Complete BigQuery table reference.
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Google Cloud Storage options specification.
# Options defining a file or a set of files within a Google Cloud Storage
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
#
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent, only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing, such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of a built-in
# infoType, when the name matches one of the existing infoTypes and that infoType
# is specified in the `InspectContent.info_types` field. Specifying the latter
# adds findings to the ones detected by the system. If the built-in info type is
# not specified in the `InspectContent.info_types` list, then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # A list of phrases to detect as a CustomInfoType.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order in which they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
#
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed at the end; other
# rules are executed in the order they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
#
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # Exclusion rule.
# The rule that specifies conditions when findings of infoTypes specified in
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # An InfoType list in an ExclusionRule drops a finding when it overlaps with or
# is contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Dictionary which defines the rule.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's
# completion/failure.
# completion/failure.
},
"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
# Command Center (CSCC Alpha).
# This action is only available for projects which are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy; see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list
# needs to trigger for a job to be started. The list may contain only
# a single Schedule trigger and must have at least one object.
{ # What event needs to occur for a new job to be started.
"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.
"recurrencePeriodDuration": "A String", # With this option a job is started a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.
},
},
],
"lastRunTime": "A String", # The timestamp of the last time this trigger executed, output only field.
"createTime": "A String", # The creation timestamp of a triggeredJob, output only field.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/triggeredJobs/53234423`.
}</pre>
</div>
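<p>The limits on <code>recurrencePeriodDuration</code> described above (at least 1 day, at most 60 days) can be checked on the client before a trigger is sent. The following is a minimal sketch, not part of this API: <code>schedule_trigger</code>, <code>MIN_PERIOD</code> and <code>MAX_PERIOD</code> are hypothetical names, and the string form used here (whole seconds followed by <code>s</code>) is an assumption based on the usual JSON encoding of durations.</p>
<pre>
```python
from datetime import timedelta

# Bounds taken from the field description: at least 1 day, at most 60 days.
MIN_PERIOD = timedelta(days=1)
MAX_PERIOD = timedelta(days=60)


def schedule_trigger(period):
    """Build one entry for the `triggers` list from a timedelta (hypothetical helper)."""
    if MIN_PERIOD > period or period > MAX_PERIOD:
        raise ValueError("recurrence period must be between 1 and 60 days")
    # Assumption: the duration is written as whole seconds followed by "s".
    seconds = int(period.total_seconds())
    return {"schedule": {"recurrencePeriodDuration": "%ds" % seconds}}


# One day, matching the "every day (86400 seconds)" example in the description.
daily = schedule_trigger(timedelta(days=1))
```
</pre>
<p>The returned dictionary is meant to be placed in the <code>triggers</code> list of a job trigger body. Rejecting out-of-range values locally is a design choice in this sketch; the service remains the authority on what it accepts.</p>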
<div class="method">
<code class="details" id="list">list(parent, orderBy=None, pageSize=None, pageToken=None, x__xgafv=None, filter=None)</code>
<pre>Lists job triggers.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Args:
parent: string, The parent resource name, for example `projects/my-project-id`. (required)
orderBy: string, Optional comma separated list of triggeredJob fields to order by,
followed by `asc` or `desc` postfix. This list is case-insensitive,
default sorting order is ascending, redundant space characters are
insignificant.
Example: `name asc,update_time, create_time desc`
Supported fields are:
- `create_time`: corresponds to time the JobTrigger was created.
- `update_time`: corresponds to time the JobTrigger was last updated.
- `last_run_time`: corresponds to the last time the JobTrigger ran.
- `name`: corresponds to JobTrigger's name.
- `display_name`: corresponds to JobTrigger's display name.
- `status`: corresponds to JobTrigger's status.
pageSize: integer, Optional size of the page, can be limited by a server.
pageToken: string, Optional page token to continue retrieval. Comes from previous call
to ListJobTriggers. `order_by` field must not
change for subsequent calls.
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
filter: string, Optional. Allows filtering.
Supported syntax:
* Filter expressions are made up of one or more restrictions.
* Restrictions can be combined by `AND` or `OR` logical operators. A
sequence of restrictions implicitly uses `AND`.
* A restriction has the form of `<field> <operator> <value>`.
* Supported fields/values for inspect jobs:
- `status` - HEALTHY|PAUSED|CANCELLED
- `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY
- `last_run_time` - RFC 3339 formatted timestamp, surrounded by
quotation marks. Nanoseconds are ignored.
- `error_count` - Number of errors that have occurred while running.
* The operator must be `=` or `!=` for status and inspected_storage.
Examples:
* inspected_storage = cloud_storage AND status = HEALTHY
* inspected_storage = cloud_storage OR inspected_storage = bigquery
* inspected_storage = cloud_storage AND (status = PAUSED OR status = HEALTHY)
* last_run_time > "2017-12-12T00:00:00+00:00"
The length of this field should be no more than 500 characters.
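The filter syntax above can be assembled programmatically. The following is a
minimal sketch, not part of this API: `build_filter`, `MAX_FILTER_LENGTH` and
`EQUALITY_ONLY_FIELDS` are hypothetical names, and the checks only mirror the
rules stated above (operator restrictions and the 500 character limit).

```python
MAX_FILTER_LENGTH = 500  # length limit stated in the field description
EQUALITY_ONLY_FIELDS = ("status", "inspected_storage")


def build_filter(restrictions, joiner="AND"):
    """Join (field, operator, value) tuples into one filter string (hypothetical helper)."""
    if joiner not in ("AND", "OR"):
        raise ValueError("joiner must be AND or OR")
    parts = []
    for field, operator, value in restrictions:
        # status and inspected_storage accept only = and != per the description.
        if field in EQUALITY_ONLY_FIELDS and operator not in ("=", "!="):
            raise ValueError("%s only supports = and !=" % field)
        parts.append("%s %s %s" % (field, operator, value))
    text = (" %s " % joiner).join(parts)
    if len(text) > MAX_FILTER_LENGTH:
        raise ValueError("filter exceeds %d characters" % MAX_FILTER_LENGTH)
    return text


healthy_gcs = build_filter([
    ("inspected_storage", "=", "cloud_storage"),
    ("status", "=", "HEALTHY"),
])
# Timestamps are passed already wrapped in quotation marks.
recent = build_filter([("last_run_time", ">", '"2017-12-12T00:00:00+00:00"')])
```

The resulting string is intended to be passed as the `filter` argument of
`list`. The sketch does not produce parenthesized groups; those would have to
be written by hand.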
Returns:
An object of the form:
{ # Response message for ListJobTriggers.
"nextPageToken": "A String", # If the next page is available then the next page token to be used
# in following ListJobTriggers request.
"jobTriggers": [ # List of triggeredJobs, up to page_size in ListJobTriggersRequest.
{ # Contains a configuration to make dlp api calls on a repeating basis.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # A status for this trigger. [required]
"updateTime": "A String", # The last update timestamp of a triggeredJob, output only field.
"errors": [ # A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared. Output only field.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"displayName": "A String", # Display name (max 100 chars)
"description": "A String", # User provided description (max 256 chars)
"inspectJob": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields in a format like `person.birthdate.year` are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification.
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusively. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusively. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that
# support reversing.
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
# infoType, when the name matches one of existing infoTypes and that infoType
# is specified in `InspectContent.info_types` field. Specifying the latter
# adds findings to the one detected by the system. If built-in info type is
# not specified in `InspectContent.info_types` list then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order that they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or
# contained within with a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS" the phone number findings are dropped if they overlap
# with EMAIL_ADDRESS finding.
# That leads to "555-222-2222@example.org" to generate only a single
# finding, namely email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's
# completion/failure.
# completion/failure.
},
"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
# Command Center (CSCC Alpha).
# This action is only available for projects which are parts of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and are governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list
# needs to trigger for a job to be started. The list may contain only
# a single Schedule trigger and must have at least one object.
{ # What event needs to occur for a new job to be started.
"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.
"recurrencePeriodDuration": "A String", # With this option a job is started a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.
},
},
],
"lastRunTime": "A String", # The timestamp of the last time this trigger executed, output only field.
"createTime": "A String", # The creation timestamp of a triggeredJob, output only field.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/jobTriggers/53234423`.
},
],
}</pre>
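<p>The clamping behavior described for <code>relativeLikelihood</code> in the response above can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's implementation: <code>LIKELIHOOD_LEVELS</code> and <code>adjust_likelihood</code> are hypothetical names, and the ordering of levels is the one implied by the field description.</p>
<pre>
```python
# Likelihood levels in ascending order, as named in the field description.
# (Assumption: only these five levels take part in relative adjustments.)
LIKELIHOOD_LEVELS = [
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]


def adjust_likelihood(base, relative_likelihood):
    """Shift `base` by `relative_likelihood` levels, clamped to the valid range.

    Hypothetical helper for illustration; not part of the client library.
    """
    index = LIKELIHOOD_LEVELS.index(base) + relative_likelihood
    # Never drop below VERY_UNLIKELY or exceed VERY_LIKELY.
    index = max(0, min(index, len(LIKELIHOOD_LEVELS) - 1))
    return LIKELIHOOD_LEVELS[index]
```
</pre>
<p>Because each adjustment is clamped before the next one is applied, an adjustment of 1 followed by -1 starting from <code>VERY_LIKELY</code> ends at <code>LIKELY</code>, matching the example in the field description.</p>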
</div>
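<div class="method">
<p>The <code>recurrencePeriodDuration</code> constraint (at least 1 day, at most 60 days) can be checked on the client before a trigger is sent. The sketch below is illustrative only: <code>schedule_trigger</code> is a hypothetical helper, and the string form (seconds followed by <code>s</code>) assumes the standard JSON encoding of a protobuf Duration.</p>
<pre>
```python
from datetime import timedelta

# Bounds stated in the recurrencePeriodDuration description.
MIN_PERIOD = timedelta(days=1)
MAX_PERIOD = timedelta(days=60)


def schedule_trigger(period):
    """Build one entry for the `triggers` list from a timedelta.

    Hypothetical helper for illustration; not part of the client library.
    """
    if MIN_PERIOD > period or period > MAX_PERIOD:
        raise ValueError("recurrence period must be between 1 and 60 days")
    seconds = int(period.total_seconds())
    # Assumption: durations are encoded as whole seconds with an "s" suffix.
    return {"schedule": {"recurrencePeriodDuration": f"{seconds}s"}}
```
</pre>
<p>For a daily schedule, <code>schedule_trigger(timedelta(days=1))</code> produces a period of 86400 seconds, the value used as the example in the field description.</p>
</div>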
<div class="method">
<code class="details" id="list_next">list_next(previous_request, previous_response)</code>
<pre>Retrieves the next page of results.
Args:
previous_request: The request for the previous page. (required)
previous_response: The response from the request for the previous page. (required)
Returns:
A request object that you can call 'execute()' on to request the next
page. Returns None if there are no more items in the collection.
</pre>
</div>
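<div class="method">
<p>A common way to combine <code>list</code> and <code>list_next</code> is a loop that runs until <code>list_next</code> returns None. The sketch below is one possible shape, not the only one: <code>iter_job_triggers</code> is a hypothetical helper, and it assumes each response page carries its items under a <code>jobTriggers</code> key.</p>
<pre>
```python
def iter_job_triggers(job_triggers, parent, page_size=None):
    """Yield every job trigger under `parent`, following pages with list_next.

    `job_triggers` is the collection object, for example the value of
    service.projects().jobTriggers() for a service created with
    googleapiclient.discovery.build("dlp", "v2").
    Hypothetical helper for illustration; not part of the client library.
    """
    kwargs = {"parent": parent}
    if page_size is not None:
        kwargs["pageSize"] = page_size
    request = job_triggers.list(**kwargs)
    while request is not None:
        response = request.execute()
        # Assumption: the items of a page sit under the "jobTriggers" key.
        for trigger in response.get("jobTriggers", []):
            yield trigger
        # list_next returns None once there are no more pages.
        request = job_triggers.list_next(
            previous_request=request, previous_response=response
        )
```
</pre>
<p>Writing the helper as a generator keeps only one page in memory at a time, and the caller can stop iterating early without fetching the remaining pages.</p>
</div>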
<div class="method">
<code class="details" id="patch">patch(name, body, x__xgafv=None)</code>
<pre>Updates a job trigger.
See https://cloud.google.com/dlp/docs/creating-job-triggers to learn more.
Args:
name: string, Resource name of the project and the triggeredJob, for example
`projects/dlp-test-project/jobTriggers/53234423`. (required)
body: object, The request body. (required)
The object takes the form of:
{ # Request message for UpdateJobTrigger.
"jobTrigger": { # Contains a configuration to make dlp api calls on a repeating basis. # New JobTrigger value.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # A status for this trigger. [required]
"updateTime": "A String", # The last update timestamp of a triggeredJob, output only field.
"errors": [ # A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared. Output only field.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"displayName": "A String", # Display name (max 100 chars)
"description": "A String", # User provided description (max 256 chars)
"inspectJob": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
# A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields in the format, like `person.birthdate.year`, are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification.
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that
# support reversing.
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
# infoType, when the name matches one of existing infoTypes and that infoType
# is specified in `InspectContent.info_types` field. Specifying the latter
# adds findings to the one detected by the system. If built-in info type is
# not specified in `InspectContent.info_types` list then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in the order in which they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It still can be used for rules matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules, contained in the set are executed in the end, other
# rules are executed in the order they are specified for each info type.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Likelihood adjustment to apply to all matching findings.
# Message for specifying an adjustment to the likelihood of a finding as
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # Exclusion rule.
# The rule that specifies conditions when findings of infoTypes specified in
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # The exclusion rule drops a finding when it overlaps or is
# contained within a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER" and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS", the phone number findings are dropped if they overlap
# with an EMAIL_ADDRESS finding.
# As a result, "555-222-2222@example.org" generates only a single
# finding, namely the email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Dictionary which defines the rule.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # Save resulting findings in a provided location.
# If set, the detailed findings will be persisted to the specified
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
#
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on job's
# completion/failure.
},
"publishSummaryToCscc": { # Publish summary to Cloud Security Command Center (Alpha).
# Publish the result summary of a DlpJob to the Cloud Security
# Command Center (CSCC Alpha).
# This action is only available for projects which are part of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish the count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and is governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a notification to a pubsub topic.
# Publish a message into given Pub/Sub topic when DlpJob has completed. The
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list
# needs to trigger for a job to be started. The list may contain only
# a single Schedule trigger and must have at least one object.
{ # What event needs to occur for a new job to be started.
"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.
"recurrencePeriodDuration": "A String", # With this option a job is started on a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.
},
},
],
"lastRunTime": "A String", # The timestamp of the last time this trigger executed, output only field.
"createTime": "A String", # The creation timestamp of a triggeredJob, output only field.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/triggeredJobs/53234423`.
},
"updateMask": "A String", # Mask to control which fields get updated.
}
x__xgafv: string, V1 error format.
Allowed values
1 - v1 error format
2 - v2 error format
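As an illustration of the request body described above, the following Python sketch builds a body that changes only the schedule of a trigger and enforces the documented 1 to 60 day bound on the recurrence period. The helper name `build_schedule_patch_body` is hypothetical and not part of the client library. The `jobTrigger` wrapper key, the seconds-with-`s`-suffix duration encoding, and the `triggers` update-mask path are assumptions that should be checked against the full request schema.

```python
# Hypothetical helper for illustration only; not part of the client library.
SECONDS_PER_DAY = 86400


def build_schedule_patch_body(period_days):
    """Return a request body that updates only the schedule trigger.

    The documented constraint is a recurrence period of at least 1 day
    and at most 60 days, so values outside that range are rejected here.
    """
    if not 1 <= period_days <= 60:
        raise ValueError("recurrence period must be between 1 and 60 days")
    # Assumption: durations are encoded as a decimal number of seconds
    # followed by "s", for example "86400s" for one day.
    duration = "%ds" % (period_days * SECONDS_PER_DAY)
    return {
        # Assumption: the trigger is wrapped in a "jobTrigger" key.
        "jobTrigger": {
            "triggers": [
                {"schedule": {"recurrencePeriodDuration": duration}},
            ],
        },
        # Assumption: "triggers" is the mask path for the trigger list.
        "updateMask": "triggers",
    }
```

The resulting dict would be passed as the `body` argument; fields not named in `updateMask` are expected to be left unchanged.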
Returns:
An object of the form:
{ # Contains a configuration to make dlp api calls on a repeating basis.
# See https://cloud.google.com/dlp/docs/concepts-job-triggers to learn more.
"status": "A String", # A status for this trigger. [required]
"updateTime": "A String", # The last update timestamp of a triggeredJob, output only field.
"errors": [ # A stream of errors encountered when the trigger was activated. Repeated
# errors may result in the JobTrigger automatically being paused.
# Will return the last 100 errors. Whenever the JobTrigger is modified
# this list will be cleared. Output only field.
{ # Detailed information about an error encountered during job execution or
# the results of an unsuccessful activation of the JobTrigger.
# Output only field.
"timestamps": [ # The times the error occurred.
"A String",
],
"details": { # The `Status` type defines a logical error model that is suitable for
# different programming environments, including REST APIs and RPC APIs. It is
# used by [gRPC](https://github.com/grpc). Each `Status` message contains
# three pieces of data: error code, error message, and error details.
#
# You can find out more about this error model and how to work with it in the
# [API Design Guide](https://cloud.google.com/apis/design/errors).
"message": "A String", # A developer-facing error message, which should be in English. Any
# user-facing error message should be localized and sent in the
# google.rpc.Status.details field, or localized by the client.
"code": 42, # The status code, which should be an enum value of google.rpc.Code.
"details": [ # A list of messages that carry the error details. There is a common set of
# message types for APIs to use.
{
"a_key": "", # Properties of the object. Contains field @type with type URL.
},
],
},
},
],
"displayName": "A String", # Display name (max 100 chars)
"description": "A String", # User provided description (max 256 chars)
"inspectJob": {
"storageConfig": { # Shared message indicating Cloud storage type. # The data to scan.
"datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification.
"partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always
# by project and namespace, however the namespace ID may be empty.
#
# A partition ID contains several dimensions:
# project ID and namespace ID.
"projectId": "A String", # The ID of the project to which the entities belong.
"namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong.
},
"kind": { # A representation of a Datastore kind. # The kind to process.
"name": "A String", # The name of the kind.
},
},
"bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification.
"excludedFields": [ # References to fields excluded from scanning. This allows you to skip
# inspection of entire columns which you know have no findings.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the
# rest of the rows are omitted. If not set, or if set to 0, all rows will be
# scanned. Only one of rows_limit and rows_limit_percent can be specified.
# Cannot be used in conjunction with TimespanConfig.
"sampleMethod": "A String",
"identifyingFields": [ # References to fields uniquely identifying rows within the table.
# Nested fields in the format, like `person.birthdate.year`, are allowed.
{ # General identifier of a data field in a storage service.
"name": "A String", # Name describing the field.
},
],
"rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows
# scanned is rounded down. Must be between 0 and 100, inclusive. Both 0 and
# 100 mean no limit. Defaults to 0. Only one of rows_limit and
# rows_limit_percent can be specified. Cannot be used in conjunction with
# TimespanConfig.
"tableReference": { # Complete BigQuery table reference.
# Message defining the location of a BigQuery table. A table is uniquely
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
},
"timespanConfig": { # Configuration of the timespan of the items to include in scanning.
# Currently only supported when inspecting Google Cloud Storage and BigQuery.
"timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items.
# Used for data sources like Datastore or BigQuery.
# If not specified for BigQuery, table last modification timestamp
# is checked against given time span.
# The valid data types of the timestamp field are:
# for BigQuery - timestamp, date, datetime;
# for Datastore - timestamp.
# Datastore entity will be scanned if the timestamp property does not exist
# or its value is empty or invalid.
"name": "A String", # Name describing the field.
},
"endTime": "A String", # Exclude files or rows newer than this value.
# If set to zero, no upper time limit is applied.
"startTime": "A String", # Exclude files or rows older than this value.
"enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out
# a valid start_time to avoid scanning files that have not been modified
# since the last time the JobTrigger executed. This will be based on the
# time of the execution of the last run of the JobTrigger.
},
"cloudStorageOptions": { # Google Cloud Storage options specification.
# Options defining a file or a set of files within a Google Cloud Storage
# bucket.
"bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger
# than this value then the rest of the bytes are omitted. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"sampleMethod": "A String",
"fileSet": { # Set of files to scan. # The set of one or more files to scan.
"url": "A String", # The Cloud Storage url of the file(s) to scan, in the format
# `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed.
#
# If the url ends in a trailing slash, the bucket or directory represented
# by the url will be scanned non-recursively (content in sub-directories
# will not be scanned). This means that `gs://mybucket/` is equivalent to
# `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to
# `gs://mybucket/directory/*`.
#
# Exactly one of `url` or `regex_file_set` must be set.
"regexFileSet": { # The regex-filtered set of files to scan. Exactly one of `url` or
# `regex_file_set` must be set.
# Message representing a set of files in a Cloud Storage bucket. Regular
# expressions are used to allow fine-grained control over which files in the
# bucket to include.
#
# Included files are those that match at least one item in `include_regex` and
# do not match any items in `exclude_regex`. Note that a file that matches
# items from both lists will _not_ be included. For a match to occur, the
# entire file path (i.e., everything in the url after the bucket name) must
# match the regular expression.
#
# For example, given the input `{bucket_name: "mybucket", include_regex:
# ["directory1/.*"], exclude_regex:
# ["directory1/excluded.*"]}`:
#
# * `gs://mybucket/directory1/myfile` will be included
# * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches
# across `/`)
# * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the
# full path doesn't match any items in `include_regex`)
# * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path
# matches an item in `exclude_regex`)
#
# If `include_regex` is left empty, it will match all files by default
# (this is equivalent to setting `include_regex: [".*"]`).
#
# Some other common use cases:
#
# * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all
# files in `mybucket` except for .pdf files
# * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will
# include all files directly under `gs://mybucket/directory/`, without matching
# across `/`
"excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in
# the bucket that match at least one of these regular expressions will be
# excluded from the scan.
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
"bucketName": "A String", # The name of a Cloud Storage bucket. Required.
"includeRegex": [ # A list of regular expressions matching file paths to include. All files in
# the bucket that match at least one of these regular expressions will be
# included in the set of files, except for those that also match an item in
# `exclude_regex`. Leaving this field empty will match all files by default
# (this is equivalent to including `.*` in the list).
#
# Regular expressions use RE2
# [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found
# under the google/re2 repository on GitHub.
"A String",
],
},
},
"bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The
# number of bytes scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0. Only one
# of bytes_limit_per_file and bytes_limit_per_file_percent can be specified.
"filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet.
# Number of files scanned is rounded down. Must be between 0 and 100,
# inclusive. Both 0 and 100 mean no limit. Defaults to 0.
"fileTypes": [ # List of file type groups to include in the scan.
# If empty, all files are scanned and available data format processors
# are applied. In addition, the binary content of the selected files
# is always scanned as well.
"A String",
],
},
},
"inspectConfig": { # Configuration description of the scanning process. # How and what to scan for.
# When used with redactContent only info_types and min_likelihood are currently
# used.
"excludeInfoTypes": True or False, # When true, excludes type information of the findings.
"limits": {
"maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job.
# When set within `InspectContentRequest`, the maximum returned is 2000
# regardless of whether this is set higher.
"maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes.
{ # Max findings configuration per infoType, per content item or long
# running DlpJob.
"infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per
# info_type should be provided. If InfoTypeLimit does not have an
# info_type, the DLP API applies the limit against all info_types that
# are found but not specified in another InfoTypeLimit.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"maxFindings": 42, # Max findings limit for the given infoType.
},
],
"maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned.
# When set within `InspectDataSourceRequest`,
# the maximum returned is 2000 regardless of whether this is set higher.
# When set within `InspectContentRequest`, this field is ignored.
},
"minLikelihood": "A String", # Only returns findings equal to or above this threshold. The default is
# POSSIBLE.
# See https://cloud.google.com/dlp/docs/likelihood to learn more.
"customInfoTypes": [ # CustomInfoTypes provided by the user. See
# https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more.
{ # Custom information type provided by the user. Used to find domain-specific
# sensitive information configurable to the data in question.
"regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"surrogateType": { # Message for detecting output from deidentification transformations that
# support reversing.
# Message for detecting output from deidentification transformations
# such as
# [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig).
# These types of transformations are
# those that perform pseudonymization, thereby producing a "surrogate" as
# output. This should be used in conjunction with a field on the
# transformation such as `surrogate_info_type`. This CustomInfoType does
# not support the use of `detection_rules`.
},
"infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in
# infoType, when the name matches one of existing infoTypes and that infoType
# is specified in `InspectContent.info_types` field. Specifying the latter
# adds findings to the one detected by the system. If built-in info type is
# not specified in `InspectContent.info_types` list then the name is treated
# as a custom info type.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
"dictionary": { # A list of phrases to detect as a CustomInfoType.
# Custom information type based on a dictionary of words or phrases. This can
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in
# `InspectDataSource`. Not currently supported in `InspectContent`.
"name": "A String", # Resource name of the requested `StoredInfoType`, for example
# `organizations/433245324/storedInfoTypes/432452342` or
# `projects/project-id/storedInfoTypes/432452342`.
"createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for
# inspection was created. Output-only field, populated by the system.
},
"detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType.
# Rules are applied in order that they are specified. Not supported for the
# `surrogate_type` CustomInfoType.
{ # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a
# `CustomInfoType` to alter behavior under certain circumstances, depending
# on the specific details of the rule. Not supported for the `surrogate_type`
# custom infoType.
"hotwordRule": { # Hotword-based detection rule.
# The rule that adjusts the likelihood of findings within a certain
# proximity of hotwords.
"proximity": { # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# Message for specifying a window around a finding to apply a detection
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
},
],
"exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding
# to be returned. It can still be used for rule matching.
"likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be
# altered by a detection rule if the finding meets the criteria specified by
# the rule. Defaults to `VERY_LIKELY` if not specified.
},
],
"includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is
# included in the response; see Finding.quote.
"ruleSet": [ # Set of rules to apply to the findings for this InspectConfig.
# Exclusion rules contained in the set are executed last; other
# rules are executed in the order in which they are specified for each infoType.
{ # Rule set for modifying a set of infoTypes to alter behavior under certain
# circumstances, depending on the specific details of the rules within the set.
"rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order.
{ # A single inspection rule to be applied to infoTypes, specified in
# `InspectionRuleSet`.
"hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule.
# proximity of hotwords.
"proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside.
# The total length of the window cannot exceed 1000 characters. Note that
# the finding itself will be included in the window, so that hotwords may
# be used to match substrings of the finding itself. For example, the
# certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be
# adjusted upwards if the area code is known to be the local area code of
# a company office using the hotword regex "\(xxx\)", where "xxx"
# is the area code in question.
# rule.
"windowAfter": 42, # Number of characters after the finding to consider.
"windowBefore": 42, # Number of characters before the finding to consider.
},
"hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings.
# part of a detection rule.
"relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of
# levels. For example, if a finding would be `POSSIBLE` without the
# detection rule and `relative_likelihood` is 1, then it is upgraded to
# `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`.
# Likelihood may never drop below `VERY_UNLIKELY` or exceed
# `VERY_LIKELY`, so applying an adjustment of 1 followed by an
# adjustment of -1 when base likelihood is `VERY_LIKELY` will result in
# a final likelihood of `LIKELY`.
"fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value.
},
},
"exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule.
# `InspectionRuleSet` are removed from results.
"regex": { # Message defining a custom regular expression. # Regular expression which defines the rule.
"pattern": "A String", # Pattern defining the regular expression. Its syntax
# (https://github.com/google/re2/wiki/Syntax) can be found under the
# google/re2 repository on GitHub.
"groupIndexes": [ # The index of the submatch to extract as findings. When not
# specified, the entire match is returned. No more than 3 may be included.
42,
],
},
"excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule.
"infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or
# contained within with a finding of an infoType from this list. For
# example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and
# `exclusion_rule` containing `exclude_info_types.info_types` with
# "EMAIL_ADDRESS" the phone number findings are dropped if they overlap
# with EMAIL_ADDRESS finding.
# That leads to "555-222-2222@example.org" to generate only a single
# finding, namely email address.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule.
# be used to match sensitive information specific to the data, such as a list
# of employee IDs or job titles.
#
# Dictionary words are case-insensitive and all characters other than letters
# and digits in the Unicode [Basic Multilingual
# Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane)
# will be replaced with whitespace when scanning for matches, so the
# dictionary phrase "Sam Johnson" will match all three phrases "sam johnson",
# "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters
# surrounding any match must be of a different type than the adjacent
# characters within the word, so letters must be next to non-letters and
# digits next to non-digits. For example, the dictionary word "jen" will
# match the first three letters of the text "jen123" but will return no
# matches for "jennifer".
#
# Dictionary words containing a large number of characters that are not
# letters or digits may result in unexpected findings because such characters
# are treated as whitespace. The
# [limits](https://cloud.google.com/dlp/limits) page contains details about
# the size limits of dictionaries. For dictionaries that do not fit within
# these constraints, consider using `LargeCustomDictionaryConfig` in the
# `StoredInfoType` API.
"wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for.
"words": [ # Words or phrases defining the dictionary. The dictionary must contain
# at least one phrase and every phrase must contain at least 2 characters
# that are letters or digits. [required]
"A String",
],
},
"cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file
# is accepted.
"path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage.
# Example: gs://[BUCKET_NAME]/dictionary.txt
},
},
"matchingType": "A String", # How the rule is applied, see MatchingType documentation for details.
},
},
],
"infoTypes": [ # List of infoTypes this rule set is applied to.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
],
"contentOptions": [ # List of options defining data content to scan.
# If empty, text, images, and other content will be included.
"A String",
],
"infoTypes": [ # Restricts what info_types to look for. The values must correspond to
# InfoType values returned by ListInfoTypes or listed at
# https://cloud.google.com/dlp/docs/infotypes-reference.
#
# When no InfoTypes or CustomInfoTypes are specified in a request, the
# system may automatically choose what detectors to run. By default this may
# be all types, but may change over time as detectors are updated.
#
# The special InfoType name "ALL_BASIC" can be used to trigger all detectors,
# but may change over time as new InfoTypes are added. If you need precise
# control and predictability as to what detectors are run you should specify
# specific InfoTypes listed in the reference.
{ # Type of information detected by the API.
"name": "A String", # Name of the information type. Either a name of your choosing when
# creating a CustomInfoType, or one of the names listed
# at https://cloud.google.com/dlp/docs/infotypes-reference when specifying
# a built-in type. InfoType names should conform to the pattern
# [a-zA-Z0-9_]{1,64}.
},
],
},
"inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig.
# `inspect_config` will be merged into the values persisted as part of the
# template.
"actions": [ # Actions to execute at the completion of the job.
{ # A task to execute on the completion of a job.
# See https://cloud.google.com/dlp/docs/concepts-actions to learn more.
"saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location.
# OutputStorageConfig. Only a single instance of this action can be
# specified.
# Compatible with: Inspect, Risk
"outputConfig": { # Cloud repository for storing output.
"table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing
# dataset. If table_id is not set a new one will be generated
# for you with the following format:
# dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for
# generating the date details.
#
# For Inspect, each column in an existing output table must have the same
# name, type, and mode of a field in the `Finding` object.
#
# For Risk, an existing output table should be the output of a previous
# Risk analysis job run on the same source table, with the same privacy
# metric and quasi-identifiers. Risk jobs that analyze the same table but
# compute a different privacy metric, or use different sets of
# quasi-identifiers, cannot store their results in the same table.
# identified by its project_id, dataset_id, and table_name. Within a query
# a table is often referenced with a string in the format of:
# `<project_id>:<dataset_id>.<table_id>` or
# `<project_id>.<dataset_id>.<table_id>`.
"projectId": "A String", # The Google Cloud Platform project ID of the project containing the table.
# If omitted, project ID is inferred from the API call.
"tableId": "A String", # Name of the table.
"datasetId": "A String", # Dataset ID of the table.
},
"outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only
# used for Inspect and must be unspecified for Risk jobs. Columns are derived
# from the `Finding` object. If appending to an existing table, any columns
# from the predefined schema that are missing will be added. No columns in
# the existing table will be deleted.
#
# If unspecified, then all available columns will be used for a new table or
# an (existing) table with no schema, and no changes will be made to an
# existing table that has a schema.
},
},
"jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's
# completion/failure.
# completion/failure.
},
"publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha).
# Command Center (CSCC Alpha).
# This action is only available for projects which are parts of
# an organization and whitelisted for the alpha Cloud Security Command
# Center.
# The action will publish count of finding instances and their info types.
# The summary of findings will be persisted in CSCC and are governed by CSCC
# service-specific policy, see https://cloud.google.com/terms/service-terms
# Only a single instance of this action can be specified.
# Compatible with: Inspect
},
"pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic.
# message contains a single field, `DlpJobName`, which is equal to the
# finished job's
# [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob).
# Compatible with: Inspect, Risk
"topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given
# publishing access rights to the DLP API service account executing
# the long running DlpJob sending the notifications.
# Format is projects/{project}/topics/{topic}.
},
},
],
},
"triggers": [ # A list of triggers which will be OR'ed together. Only one in the list
# needs to trigger for a job to be started. The list may contain only
# a single Schedule trigger and must have at least one object.
{ # What event needs to occur for a new job to be started.
"schedule": { # Schedule for triggeredJobs. # Create a job on a repeating basis based on the elapse of time.
"recurrencePeriodDuration": "A String", # With this option a job is started a regular periodic basis. For
# example: every day (86400 seconds).
#
# A scheduled start time will be skipped if the previous
# execution has not ended when its scheduled time occurs.
#
# This value must be set to a time duration greater than or equal
# to 1 day and can be no longer than 60 days.
},
},
],
"lastRunTime": "A String", # The timestamp of the last time this trigger executed, output only field.
"createTime": "A String", # The creation timestamp of a triggeredJob, output only field.
"name": "A String", # Unique resource name for the triggeredJob, assigned by the service when the
# triggeredJob is created, for example
# `projects/dlp-test-project/triggeredJobs/53234423`.
}</pre>
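<p>The schedule constraints and infoType naming rule described above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the generated client: the helper names <code>schedule_trigger</code> and <code>build_job_trigger</code> are hypothetical, the <code>inspectJob</code> wrapper key is assumed from the wider JobTrigger schema rather than from the fields shown here, and the <code>"86400s"</code> duration string form is assumed from the usual JSON encoding of protobuf durations.</p>

```python
import re

SECONDS_PER_DAY = 86400
MIN_PERIOD_S = 1 * SECONDS_PER_DAY    # documented lower bound: 1 day
MAX_PERIOD_S = 60 * SECONDS_PER_DAY   # documented upper bound: 60 days

# Pattern quoted in the schema comments for InfoType names.
INFO_TYPE_NAME = re.compile(r"[a-zA-Z0-9_]{1,64}")


def schedule_trigger(period_seconds):
    """Build one `triggers` entry holding a Schedule (hypothetical helper)."""
    if period_seconds not in range(MIN_PERIOD_S, MAX_PERIOD_S + 1):
        raise ValueError("recurrence period must be between 1 and 60 days")
    # Assumption: durations are sent as a string of seconds with an "s" suffix.
    return {"schedule": {"recurrencePeriodDuration": f"{period_seconds}s"}}


def build_job_trigger(info_type_names, period_seconds):
    """Assemble a small JobTrigger-shaped dict (hypothetical helper)."""
    for name in info_type_names:
        if INFO_TYPE_NAME.fullmatch(name) is None:
            raise ValueError(f"invalid infoType name: {name!r}")
    return {
        # Assumption: inspectConfig sits under an `inspectJob` key.
        "inspectJob": {
            "inspectConfig": {
                "infoTypes": [{"name": name} for name in info_type_names],
            },
        },
        # The list may contain only a single Schedule trigger.
        "triggers": [schedule_trigger(period_seconds)],
    }


if __name__ == "__main__":
    print(build_job_trigger(["PHONE_NUMBER", "EMAIL_ADDRESS"], SECONDS_PER_DAY))
```

<p>Validating locally only mirrors the limits stated in the comments; the service remains the authority on what it accepts.</p>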
</div>
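<p>The <code>relativeLikelihood</code> description above implies a clamped step along the ordered likelihood levels. The sketch below illustrates that arithmetic only; the function name <code>adjust_likelihood</code> is hypothetical, and the service's actual implementation is not shown in this reference.</p>

```python
# Likelihood levels in increasing order, as named in the schema comments.
LEVELS = ["VERY_UNLIKELY", "UNLIKELY", "POSSIBLE", "LIKELY", "VERY_LIKELY"]


def adjust_likelihood(likelihood, relative_likelihood):
    """Shift a likelihood by a number of levels, clamped to the valid range."""
    index = LEVELS.index(likelihood) + relative_likelihood
    # Never drop below VERY_UNLIKELY or exceed VERY_LIKELY.
    index = max(0, min(len(LEVELS) - 1, index))
    return LEVELS[index]


if __name__ == "__main__":
    print(adjust_likelihood("POSSIBLE", 1))    # upgraded one level
    print(adjust_likelihood("POSSIBLE", -1))   # downgraded one level
    # +1 is clamped at VERY_LIKELY, so the following -1 lands on LIKELY.
    print(adjust_likelihood(adjust_likelihood("VERY_LIKELY", 1), -1))
```

<p>Because each step is clamped separately, adjustments at the ends of the range do not cancel out, which matches the <code>VERY_LIKELY</code> example in the field description.</p>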
</body></html>