Skip to content

Fix invalid database entries with all null sensor fields #95

@JulianKahnert

Description

@JulianKahnert

Problem Description

The database contains entries where all sensor field columns are null. This occurs when HomeKit characteristic reading fails but the fallback EntityStorageItem(entityId: entityId) is still written to the database. These invalid entries pollute the database with meaningless data and should be rejected with a critical error logged instead.

Root Cause

In Sources/Adapter/Extensions/HMCharacteristic.swift line 86, when an error occurs while reading HomeKit characteristics, the code creates an empty EntityStorageItem with only the entityId set:

} catch {
    // this might occur when e.g. the IKEA hub or a device is not available
    homeKitLogger.critical("Error while getting characteristic data - \(self.service?.accessory?.room?.name ?? "")@\(self.service?.accessory?.name ?? "") - \(self)\n\(error)")
    return EntityStorageItem(entityId: entityId)  // <- Returns item with all sensor fields nil
}

This empty item flows through:

  1. HomeKitAdapter+Delegates.swift line 195-198 yields it to the entity stream
  2. EventProcessingJob.swift line 26 passes it to homeManager.addEntityHistory(item)
  3. HomeManager.swift line 136 calls storageRepo.add(item)
  4. EntityStorageDbRepository.swift line 63 writes it to the database without validation

Current Behavior

  • Empty EntityStorageItem objects (all sensor fields nil) are written to the database
  • These entries waste storage space
  • They appear in history queries but provide no useful information
  • No validation prevents this invalid data from being persisted

Expected Behavior

  • EntityStorageItem objects with all sensor fields nil should be rejected before database write
  • A critical error should be logged explaining why the item was rejected
  • The database should only contain valid sensor readings

Implementation Plan

Overview

Add validation at the database repository layer to reject EntityStorageItem objects that have all sensor fields set to nil. This ensures invalid data never reaches the database while maintaining clear error logging.

Affected Components

  1. Sources/HAModels/EntityStorageItem.swift - Add validation method
  2. Sources/Server/Models/EntityStorageDbRepository.swift - Add validation before database write
  3. Tests/ - Add unit tests for validation logic

Implementation Steps

Step 1: Add Validation Method to EntityStorageItem

Add a computed property to EntityStorageItem that checks if all sensor fields are nil:

// In Sources/HAModels/EntityStorageItem.swift

public extension EntityStorageItem {
    /// Returns true if all sensor fields are nil, indicating an invalid/empty storage item
    var hasNoSensorData: Bool {
        return motionDetected == nil &&
               _illuminanceInLux == nil &&
               isDeviceOn == nil &&
               brightness == nil &&
               colorTemperature == nil &&
               color == nil &&
               isContactOpen == nil &&
               isDoorLocked == nil &&
               stateOfCharge == nil &&
               isHeaterActive == nil &&
               _temperatureInC == nil &&
               relativeHumidity == nil &&
               carbonDioxideSensorId == nil &&
               pmDensity == nil &&
               airQuality == nil &&
               valveOpen == nil
    }
}

Rationale: Placing the validation logic in the model keeps the business rule close to the data structure and makes it reusable.

Step 2: Add Validation in EntityStorageDbRepository

Update the add method in EntityStorageDbRepository to validate before saving:

// In Sources/Server/Models/EntityStorageDbRepository.swift

func add(_ item: EntityStorageItem) async throws {
    // Validate that the item contains at least one sensor value
    guard !item.hasNoSensorData else {
        log.critical("Rejected EntityStorageItem with all nil sensor fields - entityId: \(item.entityId), timestamp: \(item.timestamp)")
        // Throw error to propagate validation failure
        throw Abort(.badRequest, reason: "Cannot store entity history item with all nil sensor fields for entityId: \(item.entityId)")
    }
    
    try await Self.map(item).save(on: database)
}

Rationale:

  • The repository layer is the appropriate place for data validation before persistence
  • Using .critical log level ensures visibility in logs
  • Throwing an error allows upstream code to handle validation failures appropriately
  • The error message includes entityId for debugging

Step 3: Handle Validation Error in HomeManager

Update HomeManager.addEntityHistory to handle validation errors gracefully:

// In Sources/HAApplicationLayer/HomeManager.swift

public func addEntityHistory(_ item: EntityStorageItem) async {
    log.debug("Adding entity item to storage \(item.entityId)")
    await entityCache.insert(item, forKey: item.entityId)

    // Persist item in the background to avoid blocking automation execution
    Task.detached(priority: .background) {
        do {
            if var currentItem = try await self.storageRepo.getCurrent(item.entityId) {

                // found current item, save it when a change has happened
                // we want to exclude the timestamp from the equality comparison, so change the timestamp temporarily
                currentItem.timestamp = item.timestamp
                guard item != currentItem else { return }

                try await self.storageRepo.add(item)
            } else {
                // no current item found add it to the store directly
                try await self.storageRepo.add(item)
            }
        } catch let error as AbortError where error.status == .badRequest {
            // Validation error - log at debug level since this is expected for empty items
            self.log.debug("Skipping invalid entity item: \(error.reason)")
        } catch {
            self.log.critical("Failed to persist entity item \(error)")
            assertionFailure()
        }
    }
}

Rationale:

  • Catch the validation error specifically to avoid triggering assertionFailure()
  • Use debug level for validation errors since they're expected (not system failures)
  • Preserve existing error handling for genuine persistence failures

Step 4: Consider Optional Improvement to Source (HMCharacteristic)

Optional: Instead of creating empty EntityStorageItem objects, return nil when reading fails:

// In Sources/Adapter/Extensions/HMCharacteristic.swift line 83-87

} catch {
    // this might occur when e.g. the IKEA hub or a device is not available
    homeKitLogger.critical("Error while getting characteristic data - \(self.service?.accessory?.room?.name ?? "")@\(self.service?.accessory?.name ?? "") - \(self)\n\(error)")
    return nil  // Changed from: return EntityStorageItem(entityId: entityId)
}

This prevents empty items from being created in the first place. The nil return is already handled by:

  • Line 106 in HomeKitAdapter+Delegates.swift: guard let item else { return nil }
  • Line 115: items.compactMap(\.self) filters out nils
  • Line 197: guard let item else { return } prevents yielding nil items

Rationale: Defense in depth - prevent invalid data at the source AND validate at persistence layer.

Decision: Implement this optional improvement to prevent unnecessary processing of invalid items through the pipeline.

Step 5: Add Unit Tests

Create tests in Tests/HomeAutomationKitTests/EntityStorageItemTests.swift:

import XCTest
@testable import HAModels

final class EntityStorageItemTests: XCTestCase {
    func testHasNoSensorData_AllNil_ReturnsTrue() {
        let entityId = EntityId(placeId: "Living Room", name: "Light", characteristicsName: nil, characteristic: .powerSensor)
        let item = EntityStorageItem(entityId: entityId)
        
        XCTAssertTrue(item.hasNoSensorData)
    }
    
    func testHasNoSensorData_WithMotionDetected_ReturnsFalse() {
        let entityId = EntityId(placeId: "Living Room", name: "Motion", characteristicsName: nil, characteristic: .motionSensor)
        let item = EntityStorageItem(entityId: entityId, motionDetected: true)
        
        XCTAssertFalse(item.hasNoSensorData)
    }
    
    func testHasNoSensorData_WithTemperature_ReturnsFalse() {
        let entityId = EntityId(placeId: "Living Room", name: "Sensor", characteristicsName: nil, characteristic: .temperatureSensor)
        let item = EntityStorageItem(entityId: entityId, temperatureInC: Measurement(value: 22.5, unit: .celsius))
        
        XCTAssertFalse(item.hasNoSensorData)
    }
    
    func testHasNoSensorData_WithBrightness_ReturnsFalse() {
        let entityId = EntityId(placeId: "Living Room", name: "Light", characteristicsName: nil, characteristic: .brightnessSensor)
        let item = EntityStorageItem(entityId: entityId, brightness: 75)
        
        XCTAssertFalse(item.hasNoSensorData)
    }
    
    func testHasNoSensorData_WithColor_ReturnsFalse() {
        let entityId = EntityId(placeId: "Living Room", name: "Light", characteristicsName: nil, characteristic: .colorSensor)
        let color = RGB(red: 1.0, green: 0.5, blue: 0.3)
        let item = EntityStorageItem(entityId: entityId, color: color)
        
        XCTAssertFalse(item.hasNoSensorData)
    }
}

Create integration tests in Tests/HomeAutomationKitTests/EntityStorageDbRepositoryTests.swift:

import XCTest
import Fluent
import FluentSQLiteDriver
@testable import Server
@testable import HAModels

final class EntityStorageDbRepositoryTests: XCTestCase {
    var database: Database!
    var repository: EntityStorageDbRepository!
    
    override func setUp() async throws {
        let app = Application(.testing)
        app.databases.use(.sqlite(.memory), as: .sqlite)
        
        // Run migrations
        app.migrations.add(CreateEntityStorageDbItem())
        app.migrations.add(AddSensorFields())
        try await app.autoMigrate()
        
        database = app.db
        repository = EntityStorageDbRepository(database: database)
    }
    
    override func tearDown() async throws {
        try await database.shutdown()
    }
    
    func testAdd_ItemWithAllNilSensorFields_ThrowsError() async throws {
        let entityId = EntityId(placeId: "Living Room", name: "Light", characteristicsName: nil, characteristic: .powerSensor)
        let emptyItem = EntityStorageItem(entityId: entityId)
        
        do {
            try await repository.add(emptyItem)
            XCTFail("Expected error to be thrown for item with all nil sensor fields")
        } catch let error as AbortError {
            XCTAssertEqual(error.status, .badRequest)
            XCTAssertTrue(error.reason.contains("all nil sensor fields"))
        } catch {
            XCTFail("Expected AbortError, got \(error)")
        }
    }
    
    func testAdd_ItemWithSensorData_Succeeds() async throws {
        let entityId = EntityId(placeId: "Living Room", name: "Motion", characteristicsName: nil, characteristic: .motionSensor)
        let validItem = EntityStorageItem(entityId: entityId, motionDetected: true)
        
        // Should not throw
        try await repository.add(validItem)
        
        // Verify it was saved
        let retrieved = try await repository.getCurrent(entityId)
        XCTAssertNotNil(retrieved)
        XCTAssertEqual(retrieved?.motionDetected, true)
    }
}

Rationale: Comprehensive tests ensure the validation logic works correctly for all sensor types and edge cases.

Step 6: Update Existing Database (Migration Strategy)

Decision: No database migration needed.

Rationale:

  • Existing null entries in the database are historical data and can remain
  • New validation only affects future writes
  • If cleanup is desired, it can be done manually with a one-time SQL query:
    DELETE FROM entityItems WHERE 
      motionDetected IS NULL AND
      illuminanceInLux IS NULL AND
      isDeviceOn IS NULL AND
      brightness IS NULL AND
      colorTemperature IS NULL AND
      colorRed IS NULL AND
      colorGreen IS NULL AND
      colorBlue IS NULL AND
      isContactOpen IS NULL AND
      isDoorLocked IS NULL AND
      stateOfCharge IS NULL AND
      isHeaterActive IS NULL AND
      temperatureInC IS NULL AND
      relativeHumidity IS NULL AND
      carbonDioxideSensorId IS NULL AND
      pmDensity IS NULL AND
      airQuality IS NULL AND
      valveOpen IS NULL;

Testing Strategy

Unit Tests

  1. Test hasNoSensorData returns true for empty EntityStorageItem
  2. Test hasNoSensorData returns false for each sensor field type
  3. Test repository validation throws error for empty items
  4. Test repository validation allows items with sensor data

Integration Tests

  1. Test end-to-end flow with HomeManager rejecting empty items
  2. Test that valid items still persist correctly
  3. Test error logging produces expected messages

Manual Testing

  1. Disconnect a HomeKit device and verify:
    • Critical error logged at characteristic reading (existing behavior)
    • Debug message logged when validation rejects empty item (new behavior)
    • No new database entries created (new behavior)
  2. Reconnect device and verify valid data is stored
  3. Check logs for validation rejection messages

Build Verification

Before committing, run:

# Swift package
swift build

# Run tests
swift test

# FlowKitAdapter app
cd Apps/FlowKitAdapter
xcodebuild -project "FlowKit Adapter.xcodeproj" -scheme "FlowKit Adapter" -sdk iphonesimulator -configuration Debug build

# SwiftLint
swiftlint --fix

Potential Risks

  1. Performance Impact: Minimal - validation is a simple nil check on 16 fields
  2. Breaking Changes: None - only affects invalid data that shouldn't be stored anyway
  3. Logging Volume: Critical logs for empty items might increase temporarily if many devices are offline
  4. Testing Gaps: Need to verify all 16 sensor field types in tests

Edge Cases Handled

  1. All fields nil: Rejected with error
  2. Single field non-nil: Accepted and stored
  3. Multiple fields non-nil: Accepted and stored
  4. Error during validation: Logged and propagated
  5. Offline devices: Empty items rejected, critical log at source preserved

Future Considerations

After this fix is implemented and validated, consider:

  1. Adding database index on sensor fields for query performance
  2. Adding metrics to track validation rejection rate
  3. Implementing automatic cleanup job for existing invalid entries
  4. Adding API endpoint to query statistics about invalid data in the database

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions