Skip to content

Zigbee Lock: Support SLGA Migration#2939

Open
hcarter-775 wants to merge 7 commits intomainfrom
re-structure/feature/chad-16364
Open

Zigbee Lock: Support SLGA Migration#2939
hcarter-775 wants to merge 7 commits intomainfrom
re-structure/feature/chad-16364

Conversation

@hcarter-775
Copy link
Copy Markdown
Contributor

@hcarter-775 hcarter-775 commented May 1, 2026

Description of Change

Follow-up to #2937, where I've responded to some comments that myself and others have left in order to clarify certain functionality.

Note on structure: The legacy-handlers subdriver keeps some subdrivers that have specific non-conformant behavior that relates specifically to lockCodes. This includes lock-without-codes, which no longer needs to exist in the modern implementation, as well as the yale-fingerprint-lock subdriver, which sets the lockCodes maxCodes state. Similarly, the yale subdriver has lockCodes-specific behavior as well.

However, upon analysis it was clear that the sub-sub-driver yale-bad-battery-reporter subdriver was 1. not unique to yale, and 2. was distinct from any lockCodes behavior, so it was pulled out into the now default subdrivers. Also, the samsungds subdriver had minimal differences between the old and new handling, so I condensed them.

The other primary thing that was removed from the now-legacy-handlers subdriver is the old added handler, for the following 2 reasons. 1. Generally, devices should not be added to this driver any longer, and 2. We now use the added main driver implementation to initially set the migrated keyword to true.

I find the fact that the entire migrated state depends on a keyword set only once in added to be a little precarious, but nothing stands out to me initially at this not working.

Summary of Completed Tests

Migration process, plus lockUsers/lockCredentials functionality, has been tested on-device with two different kwikset locks.

Unit tests added (I renamed some of the previous tests to be _legacy, to clarify that they are testing pre-migration functionality.

cbaumler and others added 2 commits May 1, 2026 09:33
First, I had AI do the following:
- Rename subdrivers: using-old-capabilities -> legacy-handlers,
  promote using-new-capabilities handlers to top-level defaults
- Make new capability handling the driver default; legacy-handlers
  subdriver overrides for pre-migration devices only
- Top-level vendor subdrivers (yale, samsungsds, yale-fingerprint-lock)
  now only activate for migrated devices to avoid double-dispatch
- Remove duplicate lock-without-codes top-level subdriver (handled
  entirely within legacy-handlers since these devices never migrate)
- Add init lifecycle handler to legacy-handlers to populate lockCodes
  state for pre-migration devices on driver restart

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Then, I cleaned up the driver further by fixing the git understanding of
what files were made in what year, and by introducing small changes to
altogether remove some sub-subdrivers from the legacy subdriver, and vice versa.
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

Duplicate profile check: Passed - no duplicate profiles detected.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

Test Results

   72 files    510 suites   0s ⏱️
2 836 tests 2 836 ✅ 0 💤 0 ❌
4 702 runs  4 702 ✅ 0 💤 0 ❌

Results for commit d74f6a5.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

File Coverage
All files 89%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/init.lua 78%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/zigbee_lock_utils.lua 90%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/lazy_load_subdriver.lua 57%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/configurations.lua 95%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/samsungsds/init.lua 94%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/legacy-handlers/init.lua 95%
/home/runner/work/SmartThingsEdgeDrivers/SmartThingsEdgeDrivers/drivers/SmartThings/zigbee-lock/src/legacy-handlers/yale/init.lua 92%

Minimum allowed coverage is 90%

Generated by 🐒 cobertura-action against d74f6a5

local init = function(driver, device)
lock_utils.reload_tables(device)
device.thread:call_with_delay(15, function(d)
reload_all_codes(device)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to reload all codes every time or can we put a guard on this so it's only done in the cases where it's needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that all of the actual sends in this function are gated on if the cache is actually empty, so I think that's effectively done.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like SendPINOverTheAir and GetPINCode are gated

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, my bad. Seems like this was 1-1 copied from the current Zigbee Lock handling (which is now in legacy-handlers), and the capability cache checks have just been moved to the new capabilities.


if (credential_index == device:get_field(lock_utils.CHECKING_CODE)) then
-- the credential we're checking has arrived
local last_slot = device:get_latest_state("main", capabilities.lockCredentials.ID,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we guaranteed to have pinUsersSupported state at this point or do we need to handle the case where it's not set yet?

device:emit_event(event)
end
local function device_added(driver, device)
if device:supports_capability_by_id(capabilities.lockCodes.ID) and device._provisioning_state == "TYPED" then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This provisioning state typed check should only not be true for devices that are driver switched, do we need to consider the driver switch case wrt migration? I assume this would rarely happen and it it did, it would be a user moving from a custom driver to ours, in which case IMO we can ask them to re-onboard if anything goes wrong.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW in the matter lock driver, we just have a separate handler for the driverSwitched event that mirrors do_configure, If we want to address the driver switch case, I think we could add handling for the driver switch case by encapsulating the configuration logic in a helper function and just calling that same function from the doConfigure and the driverSwitched handler.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO driver switching locks should be treated like a LAN device, and basically not be supported officially. Should anyone try to do it and face issues, the solution is re-onboarding.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my thinking. Agreed that we could add a driverSwitched handler if we wanted to support that, but it should not match the normal user experience.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, fyi the added handler shouldn't really be related to the SWITCHED state.

device:emit_event(LockCodes.maxCodeLength(value.value, { visibility = { displayed = false } }))
local add_user_handler = function(driver, device, command)
if lock_utils.busy_check_and_set(device, {name = lock_utils.ADD_USER, type = lock_utils.LOCK_USERS}) then
return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this happens, whats the user experience? Does the app just time out?

device:emit_event(LockCredentials.minPinCodeLen(min_code_len, { visibility = { displayed = false } }))
device:emit_event(LockCredentials.maxPinCodeLen(max_code_len, { visibility = { displayed = false } }))
device:emit_event(LockCredentials.pinUsersSupported(max_codes, { visibility = { displayed = false } }))
device:emit_event(LockCredentials.credentials(lock_credentials, { visibility = { displayed = false } }))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could consider setting the credential and user tables on the device at this point. I think everything should work since the tables are reloaded when accessed, but its a bit fragile and potentially prone to race conditions to rely on the state cache to be immediately updated.

SLGA_MIGRATED = "slgaMigrated"
}

lock_utils.DELETE_ALL_USERS = 0xFF
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overrides what was previously set ("deleteAllUsers"). I think everything will still work as intended (just a weird command status name during the delete operation, but it does look like a potential bug, and is misleading. I think you should a new constant for the protocol credential index special value that means all users.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think we will emit an event for command status with 0xFF as the command name.

lock_utils.send_events(device)
end
-- ignore handling the busy state for these commands, they are handled within their own handlers
if command ~= nil and command ~= lock_utils.DELETE_ALL_CREDENTIALS and command ~= lock_utils.DELETE_ALL_USERS then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These command ~= ... checks will always be true since command is a table.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fyi I've decided to make some updates to all the command name stuff, so I'd hold off on reviewing this for the moment.

local user_index = tonumber(command.args.userIndex)
local user_type = command.args.userType
local credential_type = command.args.credentialType
local credential_data = command.args.credentialData
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also extract credentialName, and include that in the ACTIVE_CREDENTIAL field, so the data is not lost when we emit an event. Same goes for the other places we set ACTIVE_CREDENTIAL in update_credential_handler and delete_credential_handler, and then the lock_utils.add_credential function would need to consider it as well (I dont think I missed any...)

end
event.data = {method = METHOD[0], codeId = code_id .. "", codeName = code_name}

event.data = { method = METHOD[0], codeId = code_id .. "", codeName = code_name }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the old event data, and should include the new lock capability fields. The full list of fields in the lock attribute are:

  • codeId
  • codeName
  • userIndex
  • userName
  • userType
  • timeout
  • unlockDirection

I am not sure what the plugin expects for these events, but the requirements say that unlock events should include the user information, and the codeId/codeName are legacy fields.

lock_codes[code_id] ~= nil) then
code_name = lock_codes[code_id]
local code_name = "Code " .. code_id
local user = lock_utils.get_user(device, code_id)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zigbee message's user_id maps to our credential capability not the users capability.

Suggested change
local user = lock_utils.get_user(device, code_id)
local credential = lock_utils.get_credential(device, code_id)
local user = lock_utils.get_user(device, credential.userIndex)

device.thread:call_with_delay(delay + 2, function(d)
device:send(LockCluster.server.commands.GetPINCode(device, credential_index))
end)
delay = delay + 2
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are more than 5 credentials being deleted, the "busy state" will be cleared prior to the command being completed.

We could consider upping the busy state timeout to be 30 or 60 seconds, but it comes with the possibility that the user will see unexplained failures if they try to deal with any failure situations by issuing more commands within that window of the initial failed command.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar timing situation occurs with delete_all_users

device:send(LockCluster.attributes.NumberOfPINUsersSupported:read(device))
end
if (device:get_latest_state("main", capabilities.lockUsers.ID, capabilities.lockUsers.totalUsersSupported.NAME) == nil) then
device:send(LockCluster.attributes.NumberOfTotalUsersSupported:read(device))
Copy link
Copy Markdown
Contributor

@cjswedes cjswedes May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think this is needed. 1 there is no handler for the attribute, and 2 it has no bearing on our credentials/user capabilities. I think the total number of users supported is equal to the total number of pin users supported, although I am not entirely sure how the legacy driver supported RFID credentials on zigbee locks...the max users according to the spec is the larger of NumberOfPINUsersSupported or NumberOfRFIDUsersSupported

-- Licensed under the Apache License, Version 2.0

return function(opts, driver, device)
local fingerprints = require("yale-fingerprint-lock.fingerprints")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be

Suggested change
local fingerprints = require("yale-fingerprint-lock.fingerprints")
local fingerprints = require("legacy-handlers.yale-fingerprint-lock.fingerprints")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, it is correctly pointed. The same fingerprints accessing this legacy subdriver should access the updated subdriver.

if device.data.lockCodes ~= nil and device:get_field(legacy_lock_utils.MIGRATION_COMPLETE) ~= true then
-- build the lockCodes table
local lockCodes = {}
local lc_data = json.decode(device.data.lockCodes)
Copy link
Copy Markdown
Contributor

@nickolas-deboom nickolas-deboom May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a nil check for device.data.lockCodes? in case migration is incomplete or something

device:set_field(lock_utils.CHECKING_CODE, checkingCode)
device:send(LockCluster.server.commands.GetPINCode(device, checkingCode))
end
driver:inject_capability_command(device, {
Copy link
Copy Markdown
Contributor

@ctowns ctowns May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why this is done via injecting the refresh command rather than just calling a separate function? If a delay is necessary, a timer could be used. This might just be something historical from the old driver code, I just didn't know the reason why it was done this way and was curious.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants