HCP Concepts¶
This page explains the core concepts of Hitachi Content Platform (HCP) that you'll encounter when using the HCP App.
How HCP Differs from Standard S3¶
If you're coming from AWS S3, MinIO, or Ceph, HCP will feel familiar on the data plane (it speaks S3) but very different on the management and compliance side.
| Concept | AWS S3 / MinIO | HCP |
|---|---|---|
| Hierarchy | Flat: just buckets | Multi-tenant: System → Tenants → Namespaces (= buckets) |
| Management API | Same S3 API for data + config | Separate MAPI (REST with XML or JSON bodies, port 9090) for all admin operations |
| Access protocols | S3 only | S3, REST, NFS, CIFS/SMB, SMTP, WebDAV — all accessing the same data |
| Version IDs | UUIDs ("aBcDeFgH...") | Integers (0, 1, 2, ...) |
| Authentication | Access key + secret key (arbitrary) | base64(username) + md5(password) — derived from user credentials |
| Bucket addressing | Virtual-hosted or path-style | Path-style only |
| Retention | S3 Object Lock (Governance/Compliance) | Built-in WORM + retention classes + S3 Object Lock |
| Search | External (Athena, OpenSearch) | Built-in Metadata Query API with Lucene-like syntax |
| Content classes | No equivalent | Schema-based custom metadata with typed, indexed properties |
| Replication | Cross-region replication rules | Built-in active/active geo-replication between HCP systems |
| Bulk delete | DeleteObjects with CRC32 | Requires Content-MD5; this app works around it with individual deletes |
Tip
The HCP App abstracts most of these differences. The S3 data plane handles HCP quirks (path-style, credential derivation, bulk delete workaround) transparently, while MAPI operations are exposed through a unified REST API.
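The bulk delete workaround mentioned above can be pictured as a loop of single-object deletes. The following is only an illustrative sketch, not the app's actual code: `delete_keys_individually` is a hypothetical helper, and it assumes a client object that exposes a boto3-style `delete_object(Bucket=..., Key=...)` method.

```python
from typing import Iterable, List, Tuple


def delete_keys_individually(
    client, bucket: str, keys: Iterable[str]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Delete keys one at a time instead of sending one DeleteObjects request.

    `client` is assumed to expose delete_object(Bucket=..., Key=...),
    as boto3 S3 clients do. Returns (deleted_keys, [(failed_key, error), ...]).
    """
    deleted: List[str] = []
    failed: List[Tuple[str, str]] = []
    for key in keys:
        try:
            client.delete_object(Bucket=bucket, Key=key)
            deleted.append(key)
        except Exception as exc:
            # Record the failure and continue so one bad key
            # does not abort the rest of the batch.
            failed.append((key, str(exc)))
    return deleted, failed
```

Collecting failures rather than raising on the first one roughly mirrors how a bulk delete reports per-key errors, at the cost of one request per object.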
HCP-Specific S3 Headers¶
HCP extends the S3 API with custom headers for retention and compliance:
| Header | Description |
|---|---|
| X-HCP-RETENTION | Set/get retention setting (0 deletable, -1 permanent, -2 unspecified, or datetime). |
| X-HCP-RETENTIONHOLD | Set/get a simple hold (true/false). Prevents deletion regardless of retention. |
| X-HCP-LABELRETENTIONHOLD | Manage multiple named retention holds (up to 100 per object). |
| X-HCP-PRIVILEGED | Privileged delete: remove objects under retention. Requires PRIVILEGED permission. |
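
As a rough sketch of how a client might assemble the two simplest headers before an S3 request, consider the helper below. `retention_headers` is a hypothetical name, and because the exact datetime format HCP expects is not stated here, the retention value is passed through unchanged.

```python
from typing import Dict, Optional


def retention_headers(
    retention: Optional[str] = None, hold: Optional[bool] = None
) -> Dict[str, str]:
    """Build HCP retention headers for an S3 request (illustrative only).

    retention: "0", "-1", "-2", or a datetime string, passed through as given.
    hold: True/False to set or clear a simple hold; None leaves it untouched.
    """
    headers: Dict[str, str] = {}
    if retention is not None:
        value = str(retention).strip()
        if not value:
            raise ValueError("retention must not be empty")
        headers["X-HCP-RETENTION"] = value
    if hold is not None:
        # The header takes the literal strings "true" or "false".
        headers["X-HCP-RETENTIONHOLD"] = "true" if hold else "false"
    return headers
```

For example, `retention_headers("-1", hold=True)` yields both headers, while calling it with no arguments yields an empty dict so the request is sent unchanged.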
S3 Compatibility¶
HCP supports: bucket CRUD, object CRUD, ACLs, CORS, versioning, multipart uploads, presigned URLs, server-side encryption, Object Lock, and AWS Signature V2/V4. Not supported: lifecycle, tagging, policy, website, logging, notifications, metrics, analytics, inventory, replication config, encryption config, public access block, client-side encryption.
Object-Based Storage¶
HCP stores objects in a repository. Each object permanently associates data with metadata:
graph LR
subgraph "Object"
DATA["Fixed-Content Data<br/>(immutable once stored)"]
SYS["System Metadata<br/>size · creation date · retention"]
CUSTOM["Custom Metadata<br/>(optional annotations)"]
ACL["Access Control List<br/>(optional permissions)"]
end
| Component | Description |
|---|---|
| Fixed-content data | The actual file content. Once stored, it cannot be modified (WORM -- write once, read many). |
| System metadata | Automatically managed properties: size, creation date, retention policy, hash, etc. |
| Custom metadata | Optional user-provided annotations (XML or key-value pairs via x-amz-meta-* headers). |
| ACL | Optional access control list defining who can read, write, or manage the object. |
Tenants¶
A tenant is an administrative entity that owns and manages a portion of the HCP repository. Tenants typically correspond to organizations, departments, or business units.
graph TD
HCP["HCP System"] --> T1["Tenant: Finance"]
HCP --> T2["Tenant: Engineering"]
HCP --> T3["Tenant: Compliance"]
T1 --> NS1["Namespace: invoices"]
T1 --> NS2["Namespace: reports"]
T2 --> NS3["Namespace: builds"]
T2 --> NS4["Namespace: artifacts"]
T3 --> NS5["Namespace: audit-logs"]
Each tenant has:
- Quotas -- hard and soft storage limits
- User accounts -- tenant-scoped credentials with role-based access
- Configuration -- console security, email notifications, namespace defaults
- Statistics -- storage usage and chargeback reporting
Namespaces (Buckets)¶
A namespace (also called a bucket in S3 terminology) is a logical container for objects within a tenant. Objects in one namespace are not visible in any other namespace.
Namespaces provide:
- Isolation -- separate data for different applications or purposes
- Protocol configuration -- each namespace can enable/disable HTTP, S3, NFS, CIFS, SMTP independently
- Compliance settings -- retention policies, versioning, and hold rules per namespace
- Search indexing -- custom metadata indexing for the Metadata Query API
Buckets = Namespaces
In the S3 API, namespaces are called "buckets." The terms are interchangeable. When you create a bucket via S3, you're creating a namespace. When you manage namespaces via MAPI, you're managing buckets.
Versioning¶
HCP can store multiple versions of an object, providing a history of changes over time. Each version is a separate object with its own metadata.
| Concept | Description |
|---|---|
| Version ID | Integer identifier for each version (HCP uses integer IDs, not UUIDs like AWS S3). |
| Delete marker | When versioning is enabled, deleting an object creates a delete marker instead of removing data. |
| Pruning | Removing old versions of an object while keeping the current version. |
| Purging | Permanently removing all versions of an object, including the current version. |
Versioning is enabled per namespace and can be toggled between Enabled and Suspended.
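Because version IDs are integers, client code that receives them as strings should compare them numerically rather than lexically. The sketch below illustrates this; both function names are hypothetical, and it assumes that a higher ID means a newer version, which is not stated above.

```python
from typing import Iterable, List


def sort_version_ids(version_ids: Iterable[str]) -> List[str]:
    """Order version IDs numerically, lowest first.

    Plain string sorting would place "10" before "9"; key=int avoids that.
    """
    return sorted(version_ids, key=int)


def versions_to_prune(version_ids: Iterable[str]) -> List[str]:
    """Return every version except the highest-numbered one.

    Assumes the highest ID is the current version (an assumption of this sketch).
    """
    return sort_version_ids(version_ids)[:-1]
```

This matches the pruning definition in the table: old versions are selected for removal while the current one is kept.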
Retention & Compliance¶
HCP provides WORM (Write Once, Read Many) storage with configurable retention policies to prevent premature deletion.
Retention Modes¶
| Mode | Value | Behavior |
|---|---|---|
| Deletion Allowed | 0 | Object can be deleted at any time. |
| Deletion Prohibited | -1 | Object can never be deleted (permanent retention). |
| Initial Unspecified | -2 | Retention not yet set -- can be set later. |
| Fixed date | datetime | Object cannot be deleted until the specified date. |
| Offset | A+7y | Retention calculated relative to ingest time (e.g., 7 years after creation). |
| Retention class | class name | Retention defined by a named class with a specific period. |
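
A client that reads a retention value back may want to tell these modes apart. The following is a heuristic sketch only: `classify_retention` is a hypothetical helper, and the rules for offsets (values starting with "A+", as in the example above), datetimes (values starting with a digit), and class names (anything else) are assumptions of this example rather than HCP's documented grammar.

```python
def classify_retention(value: str) -> str:
    """Map a retention value to the mode names used in the table above."""
    value = value.strip()
    if not value:
        raise ValueError("retention value must not be empty")
    special = {
        "0": "Deletion Allowed",
        "-1": "Deletion Prohibited",
        "-2": "Initial Unspecified",
    }
    if value in special:
        return special[value]
    if value.startswith("A+"):
        # Assumed offset form, e.g. "A+7y" (7 years after ingest).
        return "Offset"
    if value[0].isdigit():
        # Assumed to be a datetime such as "2030-01-01T00:00:00".
        return "Fixed date"
    # Anything else is treated as a retention class name.
    return "Retention class"
```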
S3 Object Lock¶
HCP also supports S3 Object Lock for applications using the S3 API:
- Governance mode -- most users cannot delete, but users with BypassGovernanceRetention permission can override.
- Compliance mode -- no one can delete until the retention period expires, not even administrators.
- Legal holds -- prevent deletion regardless of retention settings. Up to 100 labeled holds per object.
Retention Classes¶
Retention classes are named policies defined at the namespace level. They simplify management by letting you assign a class name instead of a specific date to each object. When you update a retention class, all objects using that class are updated automatically.
Roles & Permissions¶
Tenant Roles¶
User accounts within a tenant are assigned one or more roles:
| Role | Permissions |
|---|---|
| ADMINISTRATOR | Full tenant administration: namespaces, users, settings, protocols. |
| SECURITY | Manage console security, search security, and user accounts. |
| MONITOR | Read-only access to statistics, chargeback reports, and configuration. |
| COMPLIANCE | Manage compliance settings, retention classes, and content classes. |
Data Access Permissions¶
In addition to roles, users can be granted data access permissions for individual namespaces:
| Permission | Description |
|---|---|
| BROWSE | List namespace contents. |
| READ | View and retrieve objects and their metadata. |
| WRITE | Store, copy, and modify objects. |
| DELETE | Delete objects and versions. |
| PURGE | Permanently remove all versions. |
| SEARCH | Search objects via the Metadata Query API. |
| READ_ACL | View object and bucket ACLs. |
| WRITE_ACL | Modify object and bucket ACLs. |
| CHANGE_OWNER | Change object ownership. |
| PRIVILEGED | Delete objects under retention (privileged delete). |
Protocols¶
HCP supports multiple access protocols, each configured independently per namespace:
| Protocol | Use Case |
|---|---|
| S3 | Primary API for modern applications. RESTful, compatible with AWS S3 tools. |
| REST | HCP-native HTTP API with additional metadata features. |
| NFS | Legacy file system access (mount namespaces as directories). |
| CIFS/SMB | Windows file sharing access. |
| SMTP | Email ingestion (storing email as objects). |
| WebDAV | Web-based file management. |
Tip
Objects stored through any protocol are immediately accessible through all other enabled protocols. The S3 and REST protocols are automatically enabled when you create a namespace via S3.
Replication¶
HCP supports cross-system replication for data protection across geographically distributed systems.
graph LR
DC1["HCP System<br/>Data Center 1"] <-->|"Replication Link"| DC2["HCP System<br/>Data Center 2"]
DC1 <-->|"Replication Link"| DC3["HCP System<br/>Data Center 3"]
Key concepts:
- Replication link -- a connection between two HCP systems for data synchronization.
- Active/active -- both systems accept writes; changes replicate bidirectionally.
- Failover/failback -- redirect traffic to a surviving system during outages.
- Geo-protection -- data is stored in multiple geographic locations for disaster recovery.
Metadata Query API¶
The Metadata Query API lets you search for objects based on their metadata. It supports:
- Object-based queries -- search current objects by system metadata, custom metadata, ACLs, and content properties using a Lucene-like query language.
- Operation-based queries -- search for create, delete, purge, and dispose events for audit trails.
The API returns metadata only (not object data). Results can be paginated and sorted.
Query Syntax¶
Property-based criteria use a Lucene-like syntax:
property:value # exact match
property:(value1 value2) # match any
property:(+value1 -value2) # must/must-not
property:[start TO end] # inclusive range
property:{start TO end} # exclusive range
property:[start TO *] # open-ended range
Boolean operators: +criterion (must match), -criterion (must not match), no operator (should match, affects ranking). Group with parentheses.
Wildcards: ? (single char), * (any chars) — valid at end/middle of terms only, never at the beginning.
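The syntax rules above lend themselves to small helper functions that assemble a query string. The sketch below is illustrative: `term`, `inclusive_range`, and `build_query` are hypothetical names, and the wildcard check is a simple test on unquoted values, not a full parser.

```python
def term(prop: str, value: str, must: bool = True) -> str:
    """Return a +property:value (must) or -property:value (must-not) criterion."""
    if value.startswith(("*", "?")):
        # Wildcards are only valid in the middle or at the end of a term.
        raise ValueError("wildcards are not allowed at the start of a term")
    prefix = "+" if must else "-"
    return f"{prefix}{prop}:{value}"


def inclusive_range(prop: str, start, end="*") -> str:
    """Return a required inclusive range; end="*" leaves it open-ended."""
    return f"+{prop}:[{start} TO {end}]"


def build_query(*criteria: str) -> str:
    """Join criteria with spaces, as in the syntax listing above."""
    return " ".join(criteria)
```

For instance, `build_query(term("contentType", "application/pdf"), inclusive_range("size", 1048576))` produces `+contentType:application/pdf +size:[1048576 TO *]`.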
Searchable Properties¶
| Property | Type | Description |
|---|---|---|
| namespace | string | namespace-name.tenant-name |
| objectPath | string | Full object path |
| utf8Name | string | Object filename |
| size | long | Object size in bytes |
| contentType | string | MIME type |
| ingestTimeString | datetime | Time object was stored |
| changeTimeString | datetime | Time object was last modified |
| retention | string | Retention setting (0, -1, -2, or datetime) |
| retentionClass | string | Assigned retention class name |
| hold | boolean | Whether object is on hold |
| customMetadataContent | text | Full-text search of custom metadata XML |
Query Examples¶
# Large files (>1MB) in the finance namespace
+(namespace:"finance.europe") +size:[1048576 TO *]
# PDFs ingested in 2025
+contentType:application/pdf +ingestTimeString:[2025-01-01T00:00:00 TO 2025-12-31T23:59:59]
# Objects with custom metadata containing "department" but not "foreign"
+customMetadataContent:(+"department" -"foreign")
# Objects on hold
+hold:true
Pagination: use count (max results, 1–10,000) and offset (skip N, max 100,000). For >100,000 results, paginate using changeTimeMilliseconds ranges. Response status COMPLETE means all results returned; INCOMPLETE means more pages available.
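The count/offset pagination described above could be driven by a loop like the following. This is a sketch under stated assumptions: `run_query` is a hypothetical callable supplied by the caller, and the response shape (a dict with `status` and `results` keys) is invented for illustration and does not reflect the API's actual payload.

```python
from typing import Callable, Dict, Iterator

MAX_COUNT = 10_000    # upper bound for count, as stated above
MAX_OFFSET = 100_000  # upper bound for offset, as stated above


def iter_results(
    run_query: Callable[[int, int], Dict], page_size: int = 1000
) -> Iterator[dict]:
    """Yield results page by page until the status is COMPLETE.

    run_query(offset, count) is assumed to return
    {"status": "COMPLETE" | "INCOMPLETE", "results": [...]}.
    """
    if not 1 <= page_size <= MAX_COUNT:
        raise ValueError("page_size must be between 1 and 10,000")
    offset = 0
    while offset <= MAX_OFFSET:
        page = run_query(offset, page_size)
        results = page["results"]
        yield from results
        if page["status"] == "COMPLETE" or not results:
            return
        offset += len(results)
    # Past the offset limit, the caller needs a different strategy.
    raise RuntimeError(
        "more than 100,000 results; paginate by changeTimeMilliseconds ranges instead"
    )
```

Writing it as a generator lets callers stop early without fetching every page.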
Authentication¶
HCP uses a token-based authentication scheme:
| Type | Header format | Description |
|---|---|---|
| HCP native | Authorization: HCP base64(username):md5(password) | Username is base64-encoded, password is MD5-hashed. |
| Active Directory | Authorization: AD username@domain:password | For AD-integrated environments. Plaintext credentials. |
The HCP App wraps this in a JWT-based flow -- see Authentication for details on how the API handles credential management.
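To make the HCP native format concrete, the sketch below derives the token from a username and password using only the standard library. The function names are hypothetical, and the use of UTF-8 encoding and a lowercase hexadecimal MD5 digest are assumptions of this example.

```python
import base64
import hashlib
from typing import Dict


def hcp_auth_token(username: str, password: str) -> str:
    """Return base64(username):md5(password) as described in the table above."""
    user_part = base64.b64encode(username.encode("utf-8")).decode("ascii")
    # Assumption: the MD5 digest is sent as lowercase hex.
    pass_part = hashlib.md5(password.encode("utf-8")).hexdigest()
    return f"{user_part}:{pass_part}"


def hcp_authorization_header(username: str, password: str) -> Dict[str, str]:
    """Wrap the token in the 'Authorization: HCP ...' header form."""
    return {"Authorization": f"HCP {hcp_auth_token(username, password)}"}
```

The two halves of the token are the same values the comparison table lists as the derived S3 credentials: base64(username) and md5(password).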
MAPI Conventions¶
The Management API (port 9090) accepts XML or JSON request/response bodies. Key query parameters:
| Parameter | Description |
|---|---|
| verbose=true | Return all properties (default returns only modifiable ones). |
| prettyprint | Format response for readability (testing only). |
| offset / count | Pagination for list endpoints. |
Response headers always include X-HCP-SoftwareVersion; errors include X-HCP-ErrorMessage with human-readable details.
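The query parameters above can be combined into a request URL as in the following sketch. `mapi_url` is a hypothetical helper; the `https` scheme, the example host name, and the example resource path are assumptions for illustration, while port 9090 and the parameter names come from this page.

```python
from typing import Optional
from urllib.parse import urlencode


def mapi_url(
    host: str,
    path: str,
    *,
    verbose: bool = False,
    prettyprint: bool = False,
    offset: Optional[int] = None,
    count: Optional[int] = None,
) -> str:
    """Build a MAPI URL on port 9090 with the optional query parameters."""
    pairs = {}
    if verbose:
        pairs["verbose"] = "true"
    if offset is not None:
        pairs["offset"] = offset
    if count is not None:
        pairs["count"] = count
    parts = [urlencode(pairs)] if pairs else []
    if prettyprint:
        # prettyprint is listed without a value, so it is added as a bare flag.
        parts.append("prettyprint")
    url = f"https://{host}:9090/{path.lstrip('/')}"
    query = "&".join(parts)
    return f"{url}?{query}" if query else url
```

A call such as `mapi_url("admin.hcp.example.com", "/mapi/tenants", verbose=True, count=50)` would give `https://admin.hcp.example.com:9090/mapi/tenants?verbose=true&count=50`.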
Content Classes¶
Content classes define custom metadata schemas for object classification. They map XML elements in custom metadata to named, typed properties that can be:
- Indexed for fast search via the Metadata Query API
- Used as facets for aggregate analysis
- Typed as string, integer, date, or boolean
Content classes are defined at the tenant level and associated with specific namespaces.
For detailed coverage of content properties, indexing settings, and XPath expressions, see Administration.
Further Reading¶
- Data Protection — Erasure coding, service plans, compliance modes, retention deep dive, and replication deep dive.
- Administration — Namespace configuration, protocol details, CORS, chargeback reporting, and HCP quirks.