docs/swift-agent/design.md
68 additions & 45 deletions
@@ -28,8 +28,8 @@ The S3 agent approach does not work for OpenStack Swift because Swift does not s
 
 | ID | Requirement | Target | Rationale |
 |----|-------------|--------|-----------|
-| N1 | Event capture latency | < 10 seconds | Chorus replicates ashynchronuosly. Data copy takes time. Capture latency is tolerable when it is small compared to obj copy time|
-| N2 | Minimal impact on Swift hot path | No added latency to user requests, agent failure not leads to swift failure | Production safety |
+| N1 | Event capture latency | < 10 seconds | Chorus replicates asynchronously. Data copy takes time. Capture latency is tolerable when it is small compared to object copy time |
+| N2 | Minimal impact on Swift hot path | No added latency to user requests, agent failure does not lead to Swift failure | Production safety |
 | N3 | Deployment without Swift source modification | Preferred | Operational simplicity |
 | N4 | Support Kubernetes deployment | Required | Primary deployment model |
 | N5 | Support non-containerized deployment | Should | Customer flexibility |
@@ -40,7 +40,7 @@ The S3 agent approach does not work for OpenStack Swift because Swift does not s
 Chorus uses a policy-based replication model:
 
 1. User creates replication via Chorus API specifying: `user`, `from_storage`, `to_storage`, optionally `from_bucket`/`to_bucket`
-2. Replication policy stored in Redis (`pkg/store/replication_stores.go`)
+2. Replication policy stored in Redis
 3. Agent receives events and queries policy service to determine if event matches active replication
 4. For matching events, agent creates tasks in work queue
 5. Worker processes tasks idempotently: compares source/destination, copies if needed. It tolerates duplicated or reordered tasks.
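The idempotency described in step 5 can be illustrated with a minimal Go sketch. Everything here is an assumption made for illustration: the `Store` type, the `SyncObject` function, the use of a content hash as a stand-in for an ETag, and the choice to propagate deletions are not taken from the Chorus code base.

```go
package main

import "fmt"

// Store is a hypothetical, in-memory stand-in for an object store:
// it maps an object key to a content hash (playing the role of an ETag).
type Store map[string]string

// SyncObject makes dst match src for a single key and reports whether
// it had to change anything. Because it compares state instead of
// replaying the event, running it twice (or out of order) is harmless.
func SyncObject(src, dst Store, key string) bool {
	want, inSrc := src[key]
	have, inDst := dst[key]
	switch {
	case !inSrc && inDst:
		// Assumption: an object missing at the source is removed at the destination.
		delete(dst, key)
		return true
	case inSrc && (!inDst || have != want):
		// Missing or stale at the destination: copy.
		dst[key] = want
		return true
	default:
		// Already in sync: nothing to do.
		return false
	}
}

func main() {
	src := Store{"photos/cat.jpg": "etag-2"}
	dst := Store{"photos/cat.jpg": "etag-1", "photos/old.jpg": "etag-0"}

	// The same task is delivered twice to simulate a duplicated event.
	for _, key := range []string{"photos/cat.jpg", "photos/cat.jpg", "photos/old.jpg"} {
		fmt.Println(key, "changed:", SyncObject(src, dst, key))
	}
}
```

In this sketch the second delivery of the `photos/cat.jpg` task reports no change, which is the property that lets the agent emit tasks without deduplicating or ordering them.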
-Swift logs all requests via the [`proxy_logging` middleware](https://docs.openstack.org/swift/latest/logs.html). Log format is configurable via `log_msg_template` parameter (since Swift 2.22.0).
+Swift logs all requests via the [`proxy_logging` middleware](https://docs.openstack.org/swift/latest/logs.html). Log format is configurable via `log_msg_template` parameter.
 - Log format is not fixed. Each Swift installation may configure different templates, requiring configurable parsing.
-- Log does not contain `Swift method`, only HTTP method and path. Swift method (Object/Container create/update/metadata-update/delete) have to be calculated.
+- Log does not contain Swift method, only HTTP method and path. Swift method (Object/Container create/update/metadata-update/delete) must be derived from HTTP method + path.
 
 ### 2.3 Extensibility: Ceph RGW
 
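Deriving the Swift operation from HTTP method + path can be sketched in Go as follows. This is a hedged illustration, not the agent's implementation: the function name `ClassifySwift` and every event name except `ObjectCreated` (which the design doc uses as its example) are assumptions, and the sketch assumes the standard `/v1/<account>/<container>/<object>` path layout.

```go
package main

import (
	"fmt"
	"strings"
)

// ClassifySwift maps an HTTP method and request path to an event type.
// It returns false for requests that are not container/object mutations.
// Note: a PUT cannot be split into "create" vs "update" from the log
// line alone, so both are reported as "...Created" in this sketch.
func ClassifySwift(method, path string) (string, bool) {
	// Drop any query string before counting path segments.
	if i := strings.IndexByte(path, '?'); i >= 0 {
		path = path[:i]
	}
	// Split into at most 4 segments so that slashes inside
	// object names stay part of the object segment.
	segs := strings.SplitN(strings.TrimPrefix(path, "/"), "/", 4)
	if len(segs) == 4 && segs[3] == "" {
		segs = segs[:3] // trailing slash after the container name
	}
	if len(segs) < 3 || segs[2] == "" {
		return "", false // account-level or malformed path
	}
	kind := "Container"
	if len(segs) == 4 {
		kind = "Object"
	}
	switch method {
	case "PUT":
		return kind + "Created", true
	case "POST":
		return kind + "MetadataUpdated", true
	case "DELETE":
		return kind + "Deleted", true
	}
	return "", false // GET, HEAD and other non-mutating methods
}

func main() {
	for _, r := range [][2]string{
		{"PUT", "/v1/AUTH_x/photos/2024/cat.jpg"},
		{"POST", "/v1/AUTH_x/photos"},
		{"GET", "/v1/AUTH_x/photos/2024/cat.jpg"},
	} {
		event, ok := ClassifySwift(r[0], r[1])
		fmt.Println(r[0], r[1], event, ok)
	}
}
```

Keeping this mapping in a small pure function makes it easy to cover with table-driven tests, independently of how the log line itself is parsed.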
@@ -159,8 +159,8 @@ A log-based approach extends to Ceph RGW, which provides [ops logging](https://d
 TODO: add mermaid diagram with agent sending webhook to Chorus
 
 **Alternatives:**
 
@@ -239,12 +237,6 @@ This mapping logic could be described in a DSL, or using regexes, but would be c
 
 **Recommendation**: The agent uses predefined **source types** that encapsulate the mapping logic for each storage vendor. Users configure only the log parsing; the event classification (method + path → event type) is hardcoded per source.
 
-**Rationale**:
-- Mapping logic is well-defined per vendor (e.g., Swift PUT + 4 path segments = ObjectCreated)
-- Users shouldn't need to understand or configure this mapping
-- Keeps configuration simple; complex logic stays in testable Go code
-- Trade-off: adding new vendor requires code change, but this is infrequent and ensures correctness
-
 ```yaml
 swift_agent:
   source: openstack_swift
@@ -260,20 +252,27 @@ swift_agent:
 config: {} # not needed for RGW JSON logs
 ```
 
+**Rationale**:
+- Mapping logic is well-defined per vendor (e.g., Swift PUT + 4 path segments = ObjectCreated)
+- Users shouldn't need to understand or configure this mapping
+- Keeps configuration simple; complex logic stays in testable Go code
+- Trade-off: adding new vendor requires code change, but this is infrequent and ensures correctness
+
 ### 5.3 Alternative: parse logs with Fluent Bit
 
 Use Fluent Bit for log tailing and parsing, then send parsed events to Chorus agent via HTTP.
 Fluent Bit can use Lua scripts or regex parsers to extract the needed fields and map them to a Swift method.
 
-Here is example Fluent Bit config for Swift logs:
+---
+Below is an example Fluent Bit config for Swift logs:
 
 <details>
 
 <summary>Fluent Bit config + Lua script</summary>
 
+
 > [!WARNING]
-> Config and script are illustrative only. It was generated using AI.
-> Full implementation and testing is needed.
+> Config and script are illustrative only. They were generated using AI. Full implementation and testing are needed.
 
 
 ```ini
```ini
@@ -310,34 +309,50 @@ local BATCH_SIZE = 10
 function map_record(tag, ts, record)
   local path = record["path"]
   local method = record["method"]
+  local status = tonumber(record["status"])
+
+  -- Filter: only process successful requests (2xx)
+  if not status or status < 200 or status >= 300 then
+    return -1, ts, record
+  end
+
+  if not path or not method then
+    return -1, ts, record
+  end
 
-  if not path then
+  -- Filter: only mutations (ignore GET, HEAD)
+  if method == "GET" or method == "HEAD" then
     return -1, ts, record
   end
 
-  -- /v1/AUTH_x/container/object
+  -- Parse path: /v1/AUTH_x/container/object
   local parts = {}
   for p in string.gmatch(path, "[^/]+") do
     table.insert(parts, p)
   end
 
-  local account = parts[2]
+  local account = parts[2] -- AUTH_xxx
   local container = parts[3]
-  local object = parts[4]
+  local object = parts[4] -- may be nil for container ops
+
+  -- Extract account ID from AUTH_xxx prefix
+  if account and string.sub(account, 1, 5) == "AUTH_" then
+    account = string.sub(account, 6)
+  end
 
   local op = nil
 
   if object then
-    if method == "PUT" then op = "PutObject"
-    elseif method == "GET" then op = "GetObject"
-    elseif method == "HEAD" then op = "HeadObject"
-    elseif method == "DELETE" then op = "DeleteObject"