
Conversation

joannalauu (Contributor) commented Feb 2, 2026

What changed?
Implement the SQL plugin interface and add PostgreSQL support for DomainAudit.

  • Created the domain_audit_log table in the PostgreSQL schema
  • Created the `DomainAuditLogRow` and `DomainAuditLogFilter` data structures in sqlplugin
  • Added the `InsertIntoDomainAuditLog` and `SelectFromDomainAuditLogs` functions to the sqlplugin interfaces
  • Implemented `InsertIntoDomainAuditLog` and `SelectFromDomainAuditLogs` in the PostgreSQL plugin
  • Updated the SQL factory to create `newSQLDomainAuditStore`

Fixes: #7602

Why?
DomainAudit allows all modifications made to a domain (e.g., failovers) to be stored and retrieved. Previously, this was supported only by NoSQL databases (Cassandra, MongoDB, DynamoDB). The SQL plugin is now configured to handle DomainAudit requests, so other SQL databases can also be easily configured to support DomainAudit; this PR adds that support for PostgreSQL.

How did you test it?

  • Unit tests: go test -v ./common/persistence/sql -run "TestCreateDomainAuditLog|TestGetDomainAuditLogs|TestFactoryNewDomainAuditStore|TestInsertIntoDomainAuditLog|TestSelectFromDomainAuditLogs|TestGetDataBlobEncoding|TestGetDataBlobBytes"
  • Integration tests: Ran TestPostgresSQLDomainAuditPersistence on GitHub Actions

Potential risks

  • The PostgreSQL schema is updated with a new table; the change can be rolled back by releasing the previous PostgreSQL schema version

Release notes
Added PostgreSQL support for DomainAudit

Documentation Changes
N/A


Potential Breaking Change

Detailed Description

  • Created the domain_audit_log table in the PostgreSQL schema
  • Created the `DomainAuditLogRow` and `DomainAuditLogFilter` data structures in sqlplugin
  • Added the `InsertIntoDomainAuditLog` and `SelectFromDomainAuditLogs` functions to the sqlplugin interfaces
  • Implemented `InsertIntoDomainAuditLog` and `SelectFromDomainAuditLogs` in the PostgreSQL plugin
  • Updated the SQL factory to create `newSQLDomainAuditStore`

Impact Analysis

  • Backward Compatibility: Schema changes can be rolled back by releasing the previous PostgreSQL schema version
  • Forward Compatibility: The new table doesn't affect the existing schema

Testing Plan

  • Unit Tests: go test -v ./common/persistence/sql -run "TestCreateDomainAuditLog|TestGetDomainAuditLogs|TestFactoryNewDomainAuditStore|TestInsertIntoDomainAuditLog|TestSelectFromDomainAuditLogs|TestGetDataBlobEncoding|TestGetDataBlobBytes"
  • Persistence Tests: Ran TestPostgresSQLDomainAuditPersistence on GitHub Actions
  • Integration Tests: [Do we have integration test covering the change?]
  • Compatibility Tests: [Have we done tests to test the backward and forward compatibility?]

Rollout Plan

  • What is the rollout plan?
  • Does the order of deployment matter?
  • Is it safe to rollback? Does the order of rollback matter? Schema changes can be rolled back by releasing the previous PostgreSQL schema version
  • Is there a kill switch to mitigate the impact immediately? Changing NewDomainAuditStore() to return nil, nil should revert the effects of all changes; see the sketch after this list
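
For illustration, a minimal sketch of that kill switch, assuming the factory method shape described in this PR (the exact signature is an assumption):

func (f *Factory) NewDomainAuditStore() (persistence.DomainAuditStore, error) {
    // Hypothetical kill switch: report the store as unavailable instead of
    // constructing it, so callers skip DomainAudit entirely.
    return nil, nil
}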

Reviewer Validation

PR Description Quality (check these before reviewing code):

  • "What changed" provides a clear 1-2 line summary
    • Project Issue is linked
  • "Why" explains the full motivation with sufficient context
  • Testing is documented:
    • Unit test commands are included (with exact go test invocation)
    • Integration test setup/commands included (if integration tests were run)
    • Canary testing details included (if canary was mentioned)
  • Potential risks section is thoughtfully filled out (or legitimately N/A)
  • Release notes included if this completes a user-facing feature
  • Documentation needs are addressed (or noted if uncertain)

c-warren (Contributor) commented Feb 3, 2026

👋 Thanks! I've got this in my plans to review; unfortunately, I've been blocked by some on-call work. Will get to this ASAP!

argIndex := 5

// Handle pagination token
if filter.PageTokenMinCreatedTime != nil && filter.PageTokenMinEventID != nil {
Member:

The logic for handling pagination should be done inside sql_domain_audit_store.go to avoid duplicating it inside each SQL plugin.
You can check the code in sql_execution_store.go to see how we handle pagination that way.

Contributor:

Here is the relevant function:

func (m *sqlExecutionStore) ListConcreteExecutions(
    ctx context.Context,
    request *p.ListConcreteExecutionsRequest,
) (*p.InternalListConcreteExecutionsResponse, error) {
    filter := &sqlplugin.ExecutionsFilter{}
    if len(request.PageToken) > 0 {
        err := gobDeserialize(request.PageToken, &filter)
        if err != nil {
            return nil, &types.InternalServiceError{
                Message: fmt.Sprintf("ListConcreteExecutions failed. Error: %v", err),
            }
        }
    } else {
        filter = &sqlplugin.ExecutionsFilter{
            ShardID:    m.shardID,
            WorkflowID: "",
        }
    }
    filter.Size = request.PageSize
    executions, err := m.db.SelectFromExecutions(ctx, filter)
    if err != nil {
        if err == sql.ErrNoRows {
            return &p.InternalListConcreteExecutionsResponse{}, nil
        }
        return nil, convertCommonErrors(m.db, "ListConcreteExecutions", "", err)
    }
    if len(executions) == 0 {
        return &p.InternalListConcreteExecutionsResponse{}, nil
    }
    lastExecution := executions[len(executions)-1]
    nextFilter := &sqlplugin.ExecutionsFilter{
        ShardID:    m.shardID,
        WorkflowID: lastExecution.WorkflowID,
    }
    token, err := gobSerialize(nextFilter)
    if err != nil {
        return nil, &types.InternalServiceError{
            Message: fmt.Sprintf("ListConcreteExecutions failed. Error: %v", err),
        }
    }
    concreteExecutions, err := m.populateInternalListConcreteExecutions(executions)
    if err != nil {
        return nil, &types.InternalServiceError{
            Message: fmt.Sprintf("ListConcreteExecutions failed. Error: %v", err),
        }
    }
    return &p.InternalListConcreteExecutionsResponse{
        Executions:    concreteExecutions,
        NextPageToken: token,
    }, nil
}

At the SQL store level we perform the pagination (add the page token to the filter, handle the result of the request, and return the next page token), while at the datastore implementation level (e.g. PostgreSQL) we just implement a list query:

func (pdb *db) SelectFromExecutions(ctx context.Context, filter *sqlplugin.ExecutionsFilter) ([]sqlplugin.ExecutionsRow, error) {
    dbShardID := sqlplugin.GetDBShardIDFromHistoryShardID(int(filter.ShardID), pdb.GetTotalNumDBShards())
    var rows []sqlplugin.ExecutionsRow
    var err error
    if len(filter.DomainID) == 0 && filter.Size > 0 {
        err = pdb.driver.SelectContext(ctx, dbShardID, &rows, listExecutionQuery, filter.ShardID, filter.WorkflowID, filter.Size)
        if err != nil {
            return nil, err
        }
    } else {
        var row sqlplugin.ExecutionsRow
        err = pdb.driver.GetContext(ctx, dbShardID, &row, getExecutionQuery, filter.ShardID, filter.DomainID, filter.WorkflowID, filter.RunID)
        if err != nil {
            return nil, err
        }
        rows = append(rows, row)
    }
    return rows, err
}

gitar-bot commented Feb 12, 2026

Code Review ✅ Approved (6 resolved / 6 findings)

Well-structured PostgreSQL DomainAudit implementation with good test coverage. Previous findings have been addressed: nil DataBlob now correctly maps to empty bytes for NOT NULL columns, and SelectContext is used instead of GetContext for multi-row queries. The "SchemaUpdateCqlFiles" manifest key is a project-wide convention, not a bug.

✅ 6 resolved
Bug: Schema NOT NULL on state_before/state_after vs nil DataBlob

📄 common/persistence/sql/sql_domain_audit_store.go:183-188 📄 schema/postgres/cadence/versioned/v0.7/domain_audit_log.sql:5-8 📄 schema/postgres/cadence/schema.sql:281-284
The domain_audit_log table defines state_before BYTEA NOT NULL and state_after BYTEA NOT NULL, but the Go code allows these blobs to be nil. In sql_domain_audit_store.go, getDataBlobBytes() returns nil when the DataBlob is nil. When inserting, a Go nil byte slice is sent as SQL NULL, which will violate the NOT NULL constraint.

This is a real scenario: for domain creation operations, StateBefore will be nil (there's no prior state). The domain_audit_manager.go only sets stateBeforeBlob if request.StateBefore != nil, and similarly for stateAfterBlob.

Fix options:

  1. Change the schema to allow NULL: state_before BYTEA (without NOT NULL), or
  2. Change getDataBlobBytes to return an empty byte slice instead of nil when the blob is nil:
    func getDataBlobBytes(blob *persistence.DataBlob) []byte {
        if blob == nil {
            return []byte{}
        }
        return blob.Data
    }

Option 2 is simpler and doesn't require a schema change. The read path already handles empty byte slices correctly (len(row.StateBefore) > 0 check).

Bug: Test asserts query has no LIMIT but _getDomainAuditLogsQuery has LIMIT 1

📄 common/persistence/sql/sqlplugin/postgres/domain_audit_log_test.go:285 📄 common/persistence/sql/sqlplugin/postgres/domain_audit_log.go:52
The test case "get success with no results" at line 285 asserts assert.NotContains(t, query, "LIMIT"), but the _getDomainAuditLogsQuery used in the PageSize == 0 path clearly contains LIMIT 1 (line 52 of domain_audit_log.go). This assertion will always fail, indicating either:

  1. The test assertion is wrong and should be assert.Contains(t, query, "LIMIT 1"), or
  2. The intent was that the non-paginated path should NOT have a LIMIT clause, which means the _getDomainAuditLogsQuery should not have LIMIT 1.

Since testify/assert (not require) is used, the test will continue executing after this failure rather than stopping, masking the issue.

Bug: GetContext with multi-row query silently drops rows or errors

📄 common/persistence/sql/sqlplugin/postgres/domain_audit_log.go:96-105
When PageSize is 0 (or unset), SelectFromDomainAuditLogs uses pdb.driver.GetContext() with _getDomainAuditLogsQuery, which has no LIMIT clause and can return multiple rows. However, GetContext delegates to sqlx.DB.GetContext, which is designed for single-row results (like QueryRow). If the query returns more than one row, sqlx.Get will return an error like sql: expected 1 destination arguments in Scan, not N. If it returns exactly one row, only that single row is returned, but the intent appears to be returning all matching rows.

This means the "no pagination" path (PageSize == 0) is fundamentally broken — it will either error out or only return a single result when multiple exist.

The fix should use SelectContext (for multi-row results) instead of GetContext, and scan into a slice rather than a single pointer.
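
A minimal sketch of that fix, mirroring the SelectFromExecutions example earlier in this thread; the shard ID and query arguments here are placeholders rather than the actual implementation:

func (pdb *db) SelectFromDomainAuditLogs(ctx context.Context, filter *sqlplugin.DomainAuditLogFilter) ([]*sqlplugin.DomainAuditLogRow, error) {
    // SelectContext scans every matching row into the slice, whereas
    // GetContext only ever scans a single row.
    var rows []*sqlplugin.DomainAuditLogRow
    dbShardID := 0 // placeholder; the real code derives this from the plugin
    err := pdb.driver.SelectContext(ctx, dbShardID, &rows, _getDomainAuditLogsQuery,
        filter.DomainID, filter.MinCreatedTime, filter.MaxCreatedTime)
    if err != nil {
        return nil, err
    }
    return rows, nil
}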

Bug: Inconsistent default time calculation for end time

📄 common/persistence/sql/sqlplugin/postgres/domain_audit_log.go:64-65
The default end time is calculated using time.Unix(0, time.Now().UnixNano()) which is equivalent to just time.Now(). However, the default start time is time.Unix(0, 0) (Unix epoch).

Issue: While this works, the code is unnecessarily complex and could be simplified. More importantly, since created_time < $4 is used (exclusive), if a caller doesn't provide a MaxCreatedTime and wants to include all records up to now, they could miss records created at exactly the same nanosecond the query is executed (though this is extremely rare).

Suggested fix: For clarity and consistency, simplify to:

start := time.Time{} // zero time 
end := time.Now().Add(time.Minute) // slight buffer to ensure "now" records are included

if filter.MinCreatedTime != nil {
    start = *filter.MinCreatedTime
}
if filter.MaxCreatedTime != nil {
    end = *filter.MaxCreatedTime
}

Or consider using <= instead of < for the upper bound comparison in the query.

Quality: MySQL domain audit log methods silently return nil

📄 common/persistence/sql/sqlplugin/mysql/domain_audit_log.go:30-43
The MySQL implementation of InsertIntoDomainAuditLog and SelectFromDomainAuditLogs returns nil, nil without any error, which silently fails. This could lead to data loss since callers would believe the operation succeeded when in fact no data was persisted.

Impact: If MySQL is used as the persistence layer and domain audit logs are enabled, the caller will think the audit log was created successfully, but no data will be stored. This is a silent failure mode that could go undetected in production.

Suggested fix: Return an appropriate error indicating the feature is not yet implemented:

func (mdb *DB) InsertIntoDomainAuditLog(ctx context.Context, row *sqlplugin.DomainAuditLogRow) (sql.Result, error) {
    return nil, fmt.Errorf("domain audit log is not implemented for MySQL")
}

func (mdb *DB) SelectFromDomainAuditLogs(ctx context.Context, filter *sqlplugin.DomainAuditLogFilter) ([]*sqlplugin.DomainAuditLogRow, error) {
    return nil, fmt.Errorf("domain audit log is not implemented for MySQL")
}

Alternatively, consider adding a feature flag or capability check that allows callers to verify if domain audit is supported by the persistence layer.

  • ✅ Manifest uses incorrect key name "SchemaUpdateCqlFiles"

joannalauu requested a review from Shaddoll on February 12, 2026
c-warren (Contributor) left a comment:

Looking good! Thanks for making those changes.

Comment on lines +2 to +3
domain_id BYTEA NOT NULL,
event_id BYTEA NOT NULL,
Contributor:

I'd err on the side of TEXT here so that the IDs are human-readable natively in a database's tooling.

identity TEXT NOT NULL,
identity_type TEXT NOT NULL,
comment TEXT NOT NULL DEFAULT '',
PRIMARY KEY (domain_id, operation_type, created_time, event_id)
Contributor:

We have a composite primary key in Cassandra because its support for secondary indexes is limited (in our implementation, anyway). Because of that, we have to stuff everything we want to query by into the primary key. For this implementation I think it's fine to keep that behaviour, since we will be passing all of these values from the layer above Postgres, but you could equivalently just use the event_id as the primary key so that it is possible to fetch a row purely by event_id.

I'd also recommend adding secondary indexes on created_time and operation_type so that we can filter those records effectively when building other views (e.g. the most recent updates to visibility).
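
For illustration, a sketch of those indexes in the PostgreSQL schema (the index names are hypothetical):

-- Hypothetical secondary indexes supporting time- and operation-based
-- filtering on domain_audit_log. Names are illustrative only.
CREATE INDEX domain_audit_log_created_time_idx ON domain_audit_log (created_time);
CREATE INDEX domain_audit_log_operation_type_idx ON domain_audit_log (operation_type);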

DomainID serialization.UUID
EventID serialization.UUID
StateBefore []byte
StateBeforeEncoding string
Contributor:

We have an encoding type in commonconsts which we should use here; we want to ensure that we're using a valid/supported encoding.
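
For illustration, a sketch of what that could look like on the row struct, using the constants.EncodingType already referenced in the read path later in this review (the field names mirror the snippet above):

// Hypothetical sketch: type the encoding fields with the shared
// EncodingType instead of raw strings, so only valid/supported
// encodings can be stored on the row.
type DomainAuditLogRow struct {
    DomainID            serialization.UUID
    EventID             serialization.UUID
    StateBefore         []byte
    StateBeforeEncoding constants.EncodingType
    StateAfter          []byte
    StateAfterEncoding  constants.EncodingType
}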

StateBefore []byte
StateBeforeEncoding string
StateAfter []byte
StateAfterEncoding string
Contributor:

Same here

Comment on lines +40 to +41
domainID := serialization.MustParseUUID("d1111111-1111-1111-1111-111111111111")
eventID := serialization.MustParseUUID("e1111111-1111-1111-1111-111111111111")
Contributor:

NIT: I recommend generating a unique UUID whenever the test runs, as a best practice. It prevents hardcoding/reusing expected UUIDs and ensures the code path can't be affected by any other tests. In this particular test it doesn't matter much, but it's something to keep in mind.
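
For example, a sketch that generates fresh IDs on each run (the uuid package is an assumption; whichever UUID library the repo already uses would work equally well):

// Hypothetical sketch: generate unique IDs per run instead of hardcoding
// them, so fixtures can't collide across tests.
// Assumes "github.com/google/uuid" is imported as uuid.
domainID := serialization.MustParseUUID(uuid.New().String())
eventID := serialization.MustParseUUID(uuid.New().String())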

Comment on lines +149 to +172
auditLog := &persistence.InternalDomainAuditLog{
    EventID:         row.EventID.String(),
    DomainID:        row.DomainID.String(),
    OperationType:   row.OperationType,
    CreatedTime:     row.CreatedTime,
    LastUpdatedTime: row.LastUpdatedTime,
    Identity:        row.Identity,
    IdentityType:    row.IdentityType,
    Comment:         row.Comment,
}

if len(row.StateBefore) > 0 {
    auditLog.StateBefore = &persistence.DataBlob{
        Encoding: constants.EncodingType(row.StateBeforeEncoding),
        Data:     row.StateBefore,
    }
}

if len(row.StateAfter) > 0 {
    auditLog.StateAfter = &persistence.DataBlob{
        Encoding: constants.EncodingType(row.StateAfterEncoding),
        Data:     row.StateAfter,
    }
}
Contributor:

This would be handy to have as a common deserialization function, for example for whenever we add GetAuditLog(event_id). It'll also make it easier to test deserialization from a PostgreSQL row and any error handling we have.
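
A sketch of such a helper, reusing the mapping from the snippet above verbatim (the function name is hypothetical):

// Hypothetical helper: keep the row-to-internal mapping in one place so it
// can be reused (e.g. by a future GetAuditLog) and unit tested directly.
func domainAuditLogRowToInternal(row *sqlplugin.DomainAuditLogRow) *persistence.InternalDomainAuditLog {
    auditLog := &persistence.InternalDomainAuditLog{
        EventID:         row.EventID.String(),
        DomainID:        row.DomainID.String(),
        OperationType:   row.OperationType,
        CreatedTime:     row.CreatedTime,
        LastUpdatedTime: row.LastUpdatedTime,
        Identity:        row.Identity,
        IdentityType:    row.IdentityType,
        Comment:         row.Comment,
    }
    if len(row.StateBefore) > 0 {
        auditLog.StateBefore = &persistence.DataBlob{
            Encoding: constants.EncodingType(row.StateBeforeEncoding),
            Data:     row.StateBefore,
        }
    }
    if len(row.StateAfter) > 0 {
        auditLog.StateAfter = &persistence.DataBlob{
            Encoding: constants.EncodingType(row.StateAfterEncoding),
            Data:     row.StateAfter,
        }
    }
    return auditLog
}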
