Z-Wave API for network information



I’m looking to build a low-skill installation toolkit, so I need network comms information to give installers/users hints when they’re installing hardware in poor locations, or when a location’s communications characteristics have changed.

I’ve had a scout around the Z-Wave documentation and, unlike with other radio-based communications protocols, I cannot find information on per-node communications data such as RSSI/LQI or the network routing tables.

This sort of data must be available in some form, since there are hardware devices that help enthusiasts diagnose communications problems. But it makes no sense to me to use a hardware solution at the scale of 10k+ installation sites.

Is there a standard API for network-related metrics?




If I understand correctly, you’re looking to build into your Z/IP client some of the functionality that tools like the Z-Wave Alliance’s CIT provide.

zipgateway provides you with performance metrics for the transmissions you ask it to perform. It returns the information along with the ACK message, as a header extension. You can find all the details in the Command Class specification document, as part of the Z/IP Packet Command Class, specifically the Installation and Maintenance section.
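To make that concrete, here is a minimal Python sketch of pulling the Installation and Maintenance (IMA) data out of the header extensions on a Z/IP Packet ACK. The TLV walk follows the general (type, length, value) layout of Z/IP header extensions, but the numeric option and object type constants are assumptions from memory, so confirm them against the Z/IP Packet Command Class section of the spec before relying on them.

```python
# Hedged sketch: extracting IMA data from Z/IP Packet header extensions.
# Constants below are illustrative assumptions - verify against the spec.
IMA_EXTENSION_TYPE = 0x03       # assumed option type for the IMA report extension
IMA_ROUTE_CHANGED = 0x00        # assumed IMA object type codes
IMA_TRANSMISSION_TIME = 0x01
IMA_LAST_WORKING_ROUTE = 0x02
IMA_RSSI = 0x03

def parse_header_extensions(ext_bytes: bytes) -> dict:
    """Walk the TLV-encoded header extensions of a Z/IP Packet."""
    options, i = {}, 0
    while i + 2 <= len(ext_bytes):
        opt_type = ext_bytes[i] & 0x7F          # top bit is the 'critical' flag
        opt_len = ext_bytes[i + 1]
        options[opt_type] = ext_bytes[i + 2:i + 2 + opt_len]
        i += 2 + opt_len
    return options

def parse_ima_report(ima_bytes: bytes) -> dict:
    """Split an IMA report value into its per-transmission objects."""
    report, i = {}, 0
    while i + 2 <= len(ima_bytes):
        obj_type, obj_len = ima_bytes[i], ima_bytes[i + 1]
        value = ima_bytes[i + 2:i + 2 + obj_len]
        if obj_type == IMA_TRANSMISSION_TIME:
            report["transmission_time_ms"] = int.from_bytes(value, "big")
        elif obj_type == IMA_ROUTE_CHANGED:
            report["route_changed"] = bool(value and value[0])
        elif obj_type == IMA_LAST_WORKING_ROUTE:
            report["last_working_route"] = list(value)
        elif obj_type == IMA_RSSI:
            # Naive signed conversion; the spec's sentinel values
            # (e.g. "RSSI not available") are not handled here.
            report["rssi_per_hop"] = [b - 256 if b > 127 else b for b in value]
        i += 2 + obj_len
    return report
```

With those two helpers, something like `parse_ima_report(parse_header_extensions(ext).get(IMA_EXTENSION_TYPE, b""))` gives you per-transmission route and RSSI data that can feed straight into an “is this device in a bad spot?” heuristic.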

zipgateway also stores a lot of this data for you, which you can access through the Network Management Installation and Maintenance Command Class. The details for this Command Class are in the same document linked above; look for section 4.5.11.
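As a rough illustration of that second route, here is a hedged sketch of querying the stored, per-node statistics. The class and command identifiers are assumptions from memory rather than quotes from the spec, and how you actually ship the frame to zipgateway is left to your own Z/IP client, so treat this as a shape to check against section 4.5.11 rather than working code.

```python
# Hedged sketch: asking zipgateway for the statistics it has accumulated
# for a node. Identifiers are assumptions - confirm against section 4.5.11.
COMMAND_CLASS_NM_INSTALLATION_MAINTENANCE = 0x67   # assumed class identifier
STATISTICS_GET = 0x04                              # assumed command identifiers
STATISTICS_REPORT = 0x05

def build_statistics_get(node_id: int) -> bytes:
    """Frame asking the gateway for its stored statistics about node_id."""
    return bytes([COMMAND_CLASS_NM_INSTALLATION_MAINTENANCE,
                  STATISTICS_GET,
                  node_id])

def parse_statistics_report(frame: bytes) -> dict:
    """Decode a Statistics Report into {statistic_type: raw_value} pairs.

    The report is assumed to carry the node ID followed by TLV-encoded
    statistics (route changes, transmission counts, ...); map the type
    codes to names using the table in the specification.
    """
    assert frame[0] == COMMAND_CLASS_NM_INSTALLATION_MAINTENANCE
    assert frame[1] == STATISTICS_REPORT
    node_id = frame[2]
    stats, i = {}, 3
    while i + 2 <= len(frame):
        stat_type, stat_len = frame[i], frame[i + 1]
        stats[stat_type] = frame[i + 2:i + 2 + stat_len]
        i += 2 + stat_len
    return {"node_id": node_id, "statistics": stats}
```

Polling these counters periodically and diffing them per node is usually enough to spot a node whose transmission failure rate starts creeping up after, say, furniture gets moved.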

Finally, in mesh networks (like Z-Wave) the link budget between two nodes is a better metric for link quality than RSSI measurements. For devices that support it, you can send Powerlevel Command Class commands to determine the link budget. You can do this not only between zipgateway and a node, but also by asking a node to test its link with a third node. The details for this Command Class are also in the document linked above; look for section 4.7.
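For the Powerlevel route, the sketch below shows the overall shape of a link-margin sweep: ask node_a to fire a burst of test frames at node_b at each power reduction step and record how many were acknowledged. The command identifiers, value encodings and report layout are assumptions to be checked against section 4.7, and send_command()/wait_for_report() are placeholders for your own Z/IP transport.

```python
import time

# Assumed identifiers - check section 4.7 of the specification.
COMMAND_CLASS_POWERLEVEL = 0x73
POWERLEVEL_TEST_NODE_SET = 0x04
POWERLEVEL_TEST_NODE_GET = 0x05
POWERLEVEL_TEST_NODE_REPORT = 0x06

def link_margin_sweep(node_a, node_b, send_command, wait_for_report,
                      frames_per_level=10):
    """Return {power_reduction_dB: acked_frame_count} for node_a -> node_b."""
    results = {}
    for reduction_db in range(0, 10):      # 0x00 = normal power ... 0x09 = -9 dBm
        # Ask node_a to send a burst of test frames to node_b at this level.
        send_command(node_a, bytes([
            COMMAND_CLASS_POWERLEVEL,
            POWERLEVEL_TEST_NODE_SET,
            node_b,
            reduction_db,
            (frames_per_level >> 8) & 0xFF,   # test frame count, MSB first
            frames_per_level & 0xFF,
        ]))
        # Crude stand-in for polling until the test is no longer "in progress".
        time.sleep(2)
        send_command(node_a, bytes([COMMAND_CLASS_POWERLEVEL,
                                    POWERLEVEL_TEST_NODE_GET]))
        report = wait_for_report(node_a, POWERLEVEL_TEST_NODE_REPORT)
        # Payload layout assumed: [test node id, status, ack count MSB, ack count LSB]
        acked = (report[2] << 8) | report[3] if len(report) >= 4 else 0
        results[reduction_db] = acked
    return results
```

The lowest reduction step at which frames still get through gives a rough link margin; a node that only works at full power is exactly the kind of “poor location” hint an installer toolkit can surface.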


Thanks. Looks like a great starting point.

I wasn’t particularly starting from a Z/IP model, but it’s certainly my preferred model. One of my lessons from looking at the many related protocols is that there’s a strong tendency among protocol specialists to emphasise the technologies they know rather than the impact on the consumer. This makes it much harder to get collaboration between entities on different protocols (e.g. if I’m building a value proposition for a family to share sensors and actuators across their house, car and what they wear, I need a much more homogeneous access model and to eliminate collaborations that happen below the IP(v6) level). I’m interested in seeing whether my experience leads to the model that I expect.

On a related point, I’ve found that mesh networks are not great for what I’d call ‘unobserved systems’, as they’re not very stable over long periods of time. For such systems (which must spot their own failures and initiate a response), some software somewhere must have oversight of what’s going on. So I’d prefer a fail-fast model. ymmv :wink:


I can definitely appreciate where your head is at. If you’re planning on developing this toolkit out in the open, I would very much like to follow it and possibly contribute to it. Please let me know where I can do so if that’s the case.

I also understand your preference for a model that fails fast, but there’s a delicate balance to be struck when dealing with home mesh networks. You want to make sure that a failure is really a failure, and that it’s relevant, before reporting it or taking action on it. I’ve experienced various setups where, for example, notifications of device failure are sent out every time a device fails to reply to a ping. In real-life scenarios, devices sometimes get unplugged (maybe a retrofit switch can’t handle the current a vacuum cleaner needs) or blocked (maybe furniture was temporarily rearranged for a party) intentionally. Sometimes people happen to be walking in front of the device, or standing in the doorway when the device was pinged, and that was enough for the reply to get lost. Immediately and continuously reporting these “failures” (which in some cases might not even be affecting the rest of the network, due to alternate routes through the mesh) is very annoying from the end-user perspective. This might not be so important if you’re building a tool that is strictly used for network installation, but it’s a good idea to keep things like these in mind.
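For what it’s worth, here is a very small, entirely hypothetical sketch (not from any spec) of the kind of debounce that helps with this: only treat a node as failed after several consecutive missed pings spread over a minimum time window, so a vacuumed-over switch or a body in the doorway doesn’t page anyone.

```python
import time

class FailureDebouncer:
    """Flag a node as failed only after repeated misses over a time window."""

    def __init__(self, misses_required=3, window_seconds=1800):
        self.misses_required = misses_required
        self.window_seconds = window_seconds
        self.first_miss = {}    # node_id -> timestamp of first miss in this run
        self.miss_count = {}    # node_id -> consecutive misses

    def record_ping(self, node_id, replied: bool, now=None) -> bool:
        """Return True only when the node should be reported as failed."""
        now = time.time() if now is None else now
        if replied:
            # Any successful reply clears the pending failure state.
            self.first_miss.pop(node_id, None)
            self.miss_count.pop(node_id, None)
            return False
        self.first_miss.setdefault(node_id, now)
        self.miss_count[node_id] = self.miss_count.get(node_id, 0) + 1
        return (self.miss_count[node_id] >= self.misses_required
                and now - self.first_miss[node_id] >= self.window_seconds)
```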


I’ll keep you informed of whatever can be open - and I’ll try to make that as much as possible, as it gives much better feedback.

For background, I owned the design of what’s now BG’s Hive Home system (Zigbee-based) and spent some time trying to rescue the networking approaches used in Lowe’s old system. There are lots of these networking protocols (Aarhus University’s EPIC project identified 96 a number of years back), and they often look like they were designed by non-computer scientists. Your scenario analysis is correct - you don’t want a flood of false positives for failure, and you certainly don’t want to confuse or bombard the user with excess stuff.

otoh, I found that consumer and industrial installations of mesh networks don’t conform to many of the design assumptions, which are often based on battlefield scenarios with very dense node availability and an expectation that some nodes will fail. The battlefield mesh protocols deliberately avoid a central point of control, but they break when there isn’t enough connection redundancy, as there’s no node that understands what’s going on. This creates frequent network partitions and unexplained lags/lost messages, which really confuses and annoys the user :wink:

From an economic point of view, supporting the more complex installations can easily become the dominant cost of running a service - especially if the business over-promises what’s possible.

Everything’s fine while it’s working. But then something strange happens, and it’s impossible to work out what it was and often impossible to restore order automatically.