Administration¶
Detailed configuration reference for HCP namespaces, protocols, content metadata, and operational concerns. See HCP Concepts for the introductory overview.
Content Properties¶
Content classes group content properties — typed fields extracted from custom metadata XML using XPath expressions. When objects are ingested, HCP reads the custom metadata XML, evaluates the XPath expressions, and indexes the extracted values for fast search.
How Content Indexing Works¶
graph LR
OBJ["Object with<br/>custom metadata XML"]
CC["Content Class<br/>defines properties"]
XPATH["XPath Extraction<br/>evaluate expressions"]
IDX["Search Index<br/>queryable fields"]
OBJ --> XPATH
CC --> XPATH
XPATH --> IDX
For example, given this custom metadata XML on an ingested medical image:
<dicom_image>
<doctor>
<name>Dr. Lindqvist</name>
<department>Radiology</department>
</doctor>
<study_date>2024-03-15</study_date>
</dicom_image>
A content class with properties like doctor_name (XPath: /dicom_image/doctor/name) and study_date (XPath: /dicom_image/study_date) would extract those values and make them searchable via the Metadata Query API.
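The extraction step can be approximated locally to check an expression before adding it to a content class. This is a minimal sketch using Python's standard library, not HCP's indexing engine: the `CONTENT_CLASS` mapping and `extract_properties` helper are hypothetical names, and only simple absolute element paths like the two above are handled.

```python
import xml.etree.ElementTree as ET

CUSTOM_METADATA = """\
<dicom_image>
  <doctor>
    <name>Dr. Lindqvist</name>
    <department>Radiology</department>
  </doctor>
  <study_date>2024-03-15</study_date>
</dicom_image>
"""

# Hypothetical content class: property name -> absolute XPath expression.
CONTENT_CLASS = {
    "doctor_name": "/dicom_image/doctor/name",
    "study_date": "/dicom_image/study_date",
}


def extract_properties(xml_text, content_class):
    """Evaluate each simple absolute path against the document root."""
    root = ET.fromstring(xml_text)
    values = {}
    for prop, xpath in content_class.items():
        root_tag, _, rest = xpath.lstrip("/").partition("/")
        if root_tag != root.tag:
            continue  # expression targets a different document type
        # ElementTree only accepts paths relative to an element,
        # so the leading root segment is stripped before the lookup.
        text = root.findtext(rest) if rest else root.text
        if text is not None:
            values[prop] = text.strip()
    return values


extracted = extract_properties(CUSTOM_METADATA, CONTENT_CLASS)
print(extracted)
```

Running the sketch against the sample document yields one value per property, which is roughly what the search index would hold for that object.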
Property Definition¶
| Field | Description |
|---|---|
| name | 1–25 characters, alphanumeric + underscore, case-sensitive. |
| expression | XPath to extract the value (e.g., /dicom_image/doctor/name). Can include annotation prefixes: @annot-name:xpath-expression. |
| type | STRING, INTEGER, BOOLEAN, DATE, FLOAT, FULLTEXT. |
| multivalued | Whether the property can have multiple values per object. |
| format | Date/number format pattern (e.g., MM/dd/yyyy for DATE). |
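A client-side check can catch malformed property definitions before they are submitted. The sketch below encodes only the constraints listed in the table. The `ContentProperty` class and its `problems` method are illustrative names, not part of any HCP SDK.

```python
import re
from dataclasses import dataclass

PROPERTY_TYPES = {"STRING", "INTEGER", "BOOLEAN", "DATE", "FLOAT", "FULLTEXT"}
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,25}")


@dataclass
class ContentProperty:
    """Hypothetical local model of one content property definition."""
    name: str
    expression: str
    type: str
    multivalued: bool = False
    format: str = ""

    def problems(self):
        """Return a list of human-readable validation problems."""
        issues = []
        if not NAME_PATTERN.fullmatch(self.name):
            issues.append("name must be 1-25 alphanumeric or underscore characters")
        if self.type not in PROPERTY_TYPES:
            issues.append(f"unknown type {self.type!r}")
        if not self.expression:
            issues.append("expression is required")
        return issues


ok = ContentProperty("doctor_name", "/dicom_image/doctor/name", "STRING")
bad = ContentProperty("study date", "/dicom_image/study_date", "date")
print(ok.problems())
print(bad.problems())
```

The second definition fails twice: its name contains a space, and its type is lowercase where the table lists uppercase values (the sketch assumes type values are matched exactly).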
Custom Metadata Indexing Settings¶
Each namespace can configure what custom metadata gets indexed:
| Setting | Description |
|---|---|
| customMetadataIndexingEnabled | Master switch for indexing. Must be true for any indexing to occur. |
| fullIndexingEnabled | Index the full text of custom metadata XML (for customMetadataContent searches). |
| contentClasses | List of content classes to apply when objects are ingested. |
| excludedAnnotations | Wildcard patterns for annotations to skip (e.g., misc*, email). |
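To preview which annotations an excludedAnnotations list would skip, the patterns can be tested locally. This sketch assumes the wildcards behave like case-sensitive shell-style globs, which the table does not confirm; `is_excluded` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_excluded(annotation, patterns):
    """True if the annotation name matches any exclusion pattern."""
    return any(fnmatchcase(annotation, pattern) for pattern in patterns)


EXCLUDED = ["misc*", "email"]

for name in ("misc_notes", "email", "emails", "dicom"):
    print(name, is_excluded(name, EXCLUDED))
```

Under that assumption, misc_notes and email are skipped, while emails and dicom are still indexed.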
Namespace Configuration¶
Namespaces have several settings that are permanent — they cannot be changed after creation. Understanding these decisions upfront prevents surprises later.
Permanent vs Mutable Settings¶
graph TD
subgraph "Set at creation — PERMANENT"
OPT["Optimization Mode<br/>CLOUD or ALL"]
HASH["Hash Scheme<br/>MD5, SHA-256, etc."]
end
subgraph "Can change anytime"
QUOTA["Quotas"]
ACL["ACL mode"]
VERS["Versioning"]
PROTO["Protocol settings"]
end
subgraph "One-way switches"
COMP["Enterprise → Compliance<br/>(cannot go back)"]
ACL_EN["ACLs disabled → enabled<br/>(cannot disable again)"]
end
Optimization Mode¶
The optimization mode affects internal storage layout and determines which access protocols are available:
| Mode | Value | Description |
|---|---|---|
| Cloud | CLOUD | Optimized for S3/REST protocols only. Required for erasure coding. Supports balanced or unbalanced directory usage. |
| All protocols | ALL | Supports all access protocols including NFS, CIFS, SMTP, WebDAV. Cannot use erasure coding. |
The optimization mode cannot be changed after namespace creation, so choose it based on which access protocols you need and whether you need erasure coding.
Hash Scheme¶
Each namespace uses a cryptographic hash algorithm for object integrity verification. The algorithm is set at creation time and cannot be changed later. Supported values:
MD5, SHA-1, SHA-256, SHA-384, SHA-512, RIPEMD-160
Values are case-sensitive. SHA-256 is recommended for new namespaces.
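Because both permanent settings are fixed at creation, it can be worth validating them before a create request is sent. The sketch below is a local pre-flight check only. The function name and its parameter names are hypothetical, and the accepted values are copied from this section.

```python
OPTIMIZATION_MODES = ("CLOUD", "ALL")
HASH_SCHEMES = ("MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512", "RIPEMD-160")


def check_permanent_settings(optimization_mode, hash_scheme, erasure_coding=False):
    """Raise ValueError if a creation-time setting is invalid or inconsistent."""
    if optimization_mode not in OPTIMIZATION_MODES:
        raise ValueError(f"unknown optimization mode: {optimization_mode!r}")
    # Exact, case-sensitive match: "sha-256" is rejected.
    if hash_scheme not in HASH_SCHEMES:
        raise ValueError(f"unknown hash scheme: {hash_scheme!r}")
    if erasure_coding and optimization_mode != "CLOUD":
        raise ValueError("erasure coding requires the CLOUD optimization mode")
    return optimization_mode, hash_scheme


print(check_permanent_settings("CLOUD", "SHA-256", erasure_coding=True))
```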
Quotas¶
| Quota | Description |
|---|---|
| Hard quota | Maximum storage in decimal format with units (e.g., 50 GB, 1.25 TB). Includes data, metadata, and redundancy overhead from the service plan. Minimum 1 GB or 0.01 TB. |
| Soft quota | Percentage (10–95%) of the hard quota. When storage exceeds the soft quota, HCP generates a warning notification. Default: 85%. |
graph LR
subgraph "50 GB Hard Quota"
USED["Used: 35 GB"]
SOFT["Soft Quota (85%): 42.5 GB"]
HARD["Hard Quota: 50 GB"]
end
USED -->|"warning at"| SOFT
SOFT -->|"blocked at"| HARD
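The arithmetic behind the diagram is simple enough to sketch. The helpers below are illustrative only: they keep the quota in its stated unit rather than converting to bytes, and they treat reaching the hard quota as the point where writes are blocked, as the diagram suggests.

```python
from decimal import Decimal


def soft_quota_threshold(hard_quota, soft_percent=85):
    """Return (threshold, unit) for a hard quota string such as '50 GB'."""
    if not 10 <= soft_percent <= 95:
        raise ValueError("soft quota must be between 10 and 95 percent")
    amount, unit = hard_quota.split()
    threshold = Decimal(amount) * Decimal(soft_percent) / Decimal(100)
    return threshold, unit


def quota_status(used, hard_quota, soft_percent=85):
    """Classify usage (in the quota's unit) as 'ok', 'warning', or 'blocked'."""
    threshold, _ = soft_quota_threshold(hard_quota, soft_percent)
    hard = Decimal(hard_quota.split()[0])
    used = Decimal(str(used))
    if used >= hard:
        return "blocked"
    if used > threshold:
        return "warning"
    return "ok"


print(soft_quota_threshold("50 GB"))
print(quota_status(35, "50 GB"))
print(quota_status(44, "50 GB"))
```

With the default 85% soft quota on a 50 GB hard quota, the warning threshold is 42.5 GB, so 35 GB used is still below it.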
Multipart Upload Auto-Abort¶
multipartUploadAutoAbortDays controls when incomplete multipart uploads are automatically cleaned up (0–180 days, default: 30). A value of 0 disables auto-abort, so incomplete uploads accumulate indefinitely and keep consuming storage.
ACL Support¶
aclsUsage has three states. Transitions between them are restricted: once ACLs are enabled, they can never be disabled.
stateDiagram-v2
NOT_ENABLED --> ENFORCED: Enable + enforce
NOT_ENABLED --> NOT_ENFORCED: Enable without enforcing
ENFORCED --> NOT_ENFORCED: Stop enforcing
NOT_ENFORCED --> ENFORCED: Start enforcing
note right of NOT_ENABLED: Cannot return here\nonce ACLs are enabled
| State | Description |
|---|---|
| NOT_ENABLED | ACLs disabled. |
| ENFORCED | ACLs required and checked for access decisions. |
| NOT_ENFORCED | ACLs stored but not checked for access decisions. |
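The state diagram maps directly onto a transition table. The sketch below is a local guard an automation script could run before changing aclsUsage. The `can_transition` helper is hypothetical, and the allowed transitions are exactly those drawn above.

```python
# Allowed aclsUsage transitions, taken from the state diagram.
ALLOWED_TRANSITIONS = {
    "NOT_ENABLED": {"ENFORCED", "NOT_ENFORCED"},
    "ENFORCED": {"NOT_ENFORCED"},
    "NOT_ENFORCED": {"ENFORCED"},
}


def can_transition(current, target):
    """True if aclsUsage may move from `current` to `target`."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


print(can_transition("NOT_ENABLED", "ENFORCED"))
print(can_transition("ENFORCED", "NOT_ENABLED"))
```

No state lists NOT_ENABLED as a target, which is how the table expresses that enabling ACLs is irreversible.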
Protocol Details¶
HCP supports multiple access protocols, each with its own configuration. All protocols access the same underlying data — an object stored via S3 is immediately visible via NFS, CIFS, and other enabled protocols.
CIFS/SMB¶
CIFS (Common Internet File System) provides Windows file share access to namespace data:
| Setting | Description |
|---|---|
| caseSensitive | HCP is natively case-sensitive; CIFS can override this (default: true). |
| caseForcing | Forces filenames to uppercase or lowercase; can also be disabled. |
| requiresAuthentication | When true, requires Active Directory authentication. |
NFS¶
NFS access uses POSIX-style ownership:
| Setting | Description |
|---|---|
| uid / gid | Default POSIX user/group IDs for objects (default: 0 / root). |
| IP access | NFS only supports IP allow lists (no deny list). |
SMTP¶
HCP can ingest email directly via SMTP — useful for email archiving:
| Setting | Description |
|---|---|
| emailFormat | .eml or .mbox (default: .eml). |
| emailLocation | Directory path for email storage (default: /email/; HCP auto-creates it). |
| separateAttachments | Store email attachments as separate objects alongside the email. |
IP-Based Access Control¶
Each protocol has independent ipSettings that control network-level access:
| Setting | Description |
|---|---|
| allowAddresses | IPs/CIDRs allowed access (e.g., 192.168.100.0/24). Default: 0.0.0.0/0 (all). |
| denyAddresses | IPs/CIDRs denied access. Default: empty. |
| allowIfInBothLists | When true, IPs appearing in both lists (or neither) are allowed. When false, they are denied. |
graph TD
REQ["Incoming Request<br/>from 10.0.1.5"]
ALLOW{"In allow list?"}
DENY{"In deny list?"}
BOTH{"In both lists?"}
FLAG{"allowIfInBothLists?"}
REQ --> ALLOW
ALLOW -->|"Yes"| DENY
ALLOW -->|"No"| FLAG
DENY -->|"No"| GRANT["Access Granted"]
DENY -->|"Yes"| BOTH
BOTH --> FLAG
FLAG -->|"true"| GRANT
FLAG -->|"false"| BLOCK["Access Denied"]
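The decision can be modeled with Python's ipaddress module to predict the outcome for a given client before changing ipSettings. This sketch follows the settings table: an address in only the allow list is granted, and an address in both lists or in neither is decided by allowIfInBothLists. An address in only the deny list is assumed to be denied. The function names are hypothetical.

```python
from ipaddress import ip_address, ip_network


def _matches(addr, entries):
    """True if addr falls inside any IP or CIDR entry."""
    return any(addr in ip_network(entry, strict=False) for entry in entries)


def is_allowed(client_ip, allow, deny, allow_if_in_both_lists):
    """Model the allow/deny evaluation for one client address."""
    addr = ip_address(client_ip)
    in_allow = _matches(addr, allow)
    in_deny = _matches(addr, deny)
    if in_allow == in_deny:
        # In both lists, or in neither: the flag decides.
        return allow_if_in_both_lists
    # In exactly one list (deny-only is an assumption, see above).
    return in_allow


ALLOW = ["192.168.100.0/24"]
DENY = ["192.168.100.7"]

print(is_allowed("192.168.100.5", ALLOW, DENY, False))  # allow list only
print(is_allowed("192.168.100.7", ALLOW, DENY, False))  # both lists
print(is_allowed("10.0.1.5", ALLOW, DENY, True))        # neither list
```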
CORS Configuration¶
HCP supports Cross-Origin Resource Sharing at two levels, with a clear inheritance hierarchy:
graph TD
TC["Tenant CORS Config<br/>(default for all namespaces)"]
NS1["Namespace A<br/>(no CORS config)"]
NS2["Namespace B<br/>(has own CORS config)"]
TC -->|"inherits"| NS1
TC -.-x|"overridden"| NS2
NS2 -->|"uses own config"| NS2_RULES["Namespace B rules"]
Inheritance rules:
- If a namespace has its own CORS configuration → tenant-level rules are completely ignored for that namespace
- If a namespace has no CORS configuration → tenant-level rules apply
- If neither is configured → CORS requests are rejected
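The three inheritance rules reduce to a short lookup. In this sketch, None stands for "no configuration present", and the `effective_cors` helper is a hypothetical name used only to illustrate the precedence.

```python
def effective_cors(namespace_cors, tenant_cors):
    """Return the CORS configuration that applies to a namespace.

    A namespace-level configuration replaces the tenant-level one entirely;
    the two are never merged. None means CORS requests are rejected.
    """
    if namespace_cors is not None:
        return namespace_cors
    return tenant_cors


print(effective_cors(None, "tenant rules"))
print(effective_cors("namespace B rules", "tenant rules"))
print(effective_cors(None, None))
```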
CORS Rule Structure¶
CORS rules are configured as XML:
<CORSConfiguration>
<CORSRule>
<AllowedOrigin>https://app.example.com</AllowedOrigin>
<AllowedMethod>GET</AllowedMethod>
<AllowedMethod>PUT</AllowedMethod>
<AllowedMethod>POST</AllowedMethod>
<AllowedHeader>*</AllowedHeader>
</CORSRule>
</CORSConfiguration>
| Operation | Description |
|---|---|
| PUT on the CORS resource | Sets the configuration (replaces any existing). |
| GET | Retrieves the current configuration. |
| DELETE | Removes the configuration (returns 404 if none exists). |
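The XML body for a PUT can be generated rather than hand-written. The sketch below builds a single-rule document with the element names from the example above. The `build_cors_xml` helper is hypothetical, and sending the result to HCP is left out because the endpoint is not described here.

```python
import xml.etree.ElementTree as ET


def build_cors_xml(origins, methods, headers=("*",)):
    """Build a CORSConfiguration document containing one CORSRule."""
    config = ET.Element("CORSConfiguration")
    rule = ET.SubElement(config, "CORSRule")
    for origin in origins:
        ET.SubElement(rule, "AllowedOrigin").text = origin
    for method in methods:
        ET.SubElement(rule, "AllowedMethod").text = method
    for header in headers:
        ET.SubElement(rule, "AllowedHeader").text = header
    return ET.tostring(config, encoding="unicode")


cors_xml = build_cors_xml(["https://app.example.com"], ["GET", "PUT", "POST"])
print(cors_xml)
```

Because PUT replaces any existing configuration, the generated document should contain every rule the namespace needs, not only the new one.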
Chargeback Reporting¶
Chargeback reports provide detailed storage usage metrics per namespace, used for cost allocation across departments or customers.
Report Parameters¶
| Parameter | Description |
|---|---|
| start | Start time in ISO 8601 with timezone (e.g., 2024-01-01T00:00:00-0500). |
| end | End time in ISO 8601 with timezone. |
| granularity | hour, day, or total (aggregate for the entire period). |
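Getting the timestamp format right is the usual stumbling block. The sketch below formats timezone-aware datetimes in the shape shown in the table and assembles the three parameters. The `chargeback_params` helper is a hypothetical name, and the request URL is omitted on purpose.

```python
from datetime import datetime, timedelta, timezone

GRANULARITIES = ("hour", "day", "total")
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S%z"  # e.g. 2024-01-01T00:00:00-0500


def chargeback_params(start, end, granularity="day"):
    """Return the report parameters as a dict of strings."""
    if granularity not in GRANULARITIES:
        raise ValueError(f"granularity must be one of {GRANULARITIES}")
    if start.utcoffset() is None or end.utcoffset() is None:
        raise ValueError("start and end must be timezone-aware")
    if end <= start:
        raise ValueError("end must be after start")
    return {
        "start": start.strftime(TIME_FORMAT),
        "end": end.strftime(TIME_FORMAT),
        "granularity": granularity,
    }


est = timezone(timedelta(hours=-5))
params = chargeback_params(
    datetime(2024, 1, 1, tzinfo=est),
    datetime(2024, 2, 1, tzinfo=est),
)
print(params)
```

If the parameters are placed in a query string, they should be URL-encoded first, for example with `urllib.parse.urlencode`.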
Metrics¶
| Metric | Description |
|---|---|
| objectCount | Total objects (each version counted separately; multipart counted as one). |
| ingestedVolume | Size before compression/dedup — the "original" data size. |
| storageCapacityUsed | Bytes including data, metadata, and service plan redundancy overhead. |
| bytesIn / bytesOut | Data transferred in/out during the interval. |
| reads / writes / deletes | Operation counts during the interval. |
| multipartObjects / multipartObjectParts / multipartObjectBytes | Completed multipart upload statistics. |
| multipartUploads / multipartUploadParts / multipartUploadBytes | In-progress multipart upload statistics. |
The difference between ingestedVolume and storageCapacityUsed is important for billing: ingestedVolume is what the user uploaded, while storageCapacityUsed includes the overhead of keeping multiple copies or fragments for data protection.
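One way to make the gap visible in a billing report is to compute the ratio between the two metrics. The figures below are made up for illustration, and `protection_overhead` is a hypothetical helper.

```python
def protection_overhead(ingested_volume, storage_capacity_used):
    """Ratio of stored bytes to ingested bytes for one reporting interval."""
    if ingested_volume <= 0:
        raise ValueError("ingested_volume must be positive")
    return storage_capacity_used / ingested_volume


# Illustrative values only: 100 GB ingested, 210 GB consumed on disk.
ratio = protection_overhead(100 * 10**9, 210 * 10**9)
print(f"{ratio:.2f}x")
```

Because compression and deduplication shrink the stored data while metadata and redundancy add to it, the ratio reflects the net effect and cannot be read as a copy count.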
HCP retains 180 days of chargeback statistics. Limit chargeback report requests to once per hour — more frequent polling can cause system instability.
HCP Quirks and Gotchas¶
Behaviors that differ from standard S3 or may surprise developers:
| Behavior | Description |
|---|---|
| Bulk delete requires Content-MD5 | HCP validates Content-MD5 on DeleteObjects, but boto3 sends CRC32. The backend works around this with individual deletes. |
| 302 means "not found" | For HEAD requests, HCP returns 302 Found when a resource doesn't exist (or you lack permission). The backend maps this to 404. |
| Port 9090 is required for MAPI | Missing port 9090 in the URL returns 403 Forbidden, not a connection error. |
| Auto-adjusted dates | Invalid retention dates are silently corrected (November 33 → December 3) rather than rejected. |
| Version IDs are integers | HCP uses sequential integers (0, 1, 2...) instead of the UUIDs that AWS S3 uses. |
| Path-style addressing only | HCP requires path-style S3 URLs, not virtual-hosted style. |
| Region redirector crashes boto3 | HCP returns non-standard redirect responses that crash boto3's S3RegionRedirectorv2. The backend explicitly unregisters it. |
| Connection lifetime | HCP closes idle connections after 10 minutes. |
| Statistics polling | Limit GET requests for statistics to once per hour — more frequent polling can cause system instability. |
| Username in credentials | S3 access keys are base64(username), not arbitrary strings. Knowing someone's access key reveals their username. |
| One-way compliance mode | Switching a namespace from enterprise to compliance mode is permanent. |
| ACLs can't be disabled | Once ACLs are enabled on a namespace, they can never be turned off. |
| Hash scheme is permanent | The hash algorithm chosen at namespace creation cannot be changed. |
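Three of these quirks are easy to handle with small helpers. This is a sketch using only the standard library: the function names are hypothetical and it is not the backend's actual code. The Content-MD5 value is the standard base64-encoded MD5 digest of the request body.

```python
import base64
import hashlib


def username_from_access_key(access_key):
    """Recover the username from an access key, per the base64(username) quirk."""
    return base64.b64decode(access_key).decode("utf-8")


def content_md5(body):
    """Content-MD5 header value: base64 of the body's raw MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def normalize_head_status(status):
    """Treat a 302 answer to a HEAD request as 'not found'."""
    return 404 if status == 302 else status


access_key = base64.b64encode(b"jsmith").decode("ascii")
print(access_key, username_from_access_key(access_key))
print(content_md5(b"<Delete></Delete>"))
print(normalize_head_status(302))
```

The first helper is also a reminder to treat access keys as sensitive, since anyone holding one can read the username from it.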