Skip to content

add hal_extend_int function#3936

Open
rmu75 wants to merge 1 commit intoLinuxCNC:masterfrom
rmu75:rs/hal_extend_int
Open

add hal_extend_int function#3936
rmu75 wants to merge 1 commit intoLinuxCNC:masterfrom
rmu75:rs/hal_extend_int

Conversation

@rmu75
Copy link
Copy Markdown
Collaborator

@rmu75 rmu75 commented Apr 14, 2026

helper function to deal with wrap around and extension of lower-width counters to 64bit ints.

Comment thread src/hal/hal_lib.c Outdated
Comment thread src/hal/hal.h
Comment thread src/hal/hal_lib.c Outdated
@rmu75 rmu75 force-pushed the rs/hal_extend_int branch 3 times, most recently from b792826 to 7eafc42 Compare April 15, 2026 08:37
@BsAtHome
Copy link
Copy Markdown
Contributor

The explanation does not tell us what the function is supposed to do, why it exists and how it accomplished the desired outcome. Neither does it describe arguments, like what values old, newlow and nbits represent.

Also, IMO, the name hal_extend_int() is a misnomer for what is actually performed.

@rmu75
Copy link
Copy Markdown
Collaborator Author

rmu75 commented Apr 15, 2026

The explanation does not tell us what the function is supposed to do, why it exists and how it accomplished the desired outcome. Neither does it describe arguments, like what values old, newlow and nbits represent.

Also, IMO, the name hal_extend_int() is a misnomer for what is actually performed.

/** hal_extend_int() extends a counter value with nbits to 64 bits.
    Increments between updates need to be less than 2**(nbits-1).
    Call with current 64bit counter value to be updated, new
    lower-bit counter value and number of bits of counter.
    @returns new counter value.
    Code by Jeff Epler. */

hmm. I guess my description presupposes some context. It could certainly be more verbose. Regarding the name, I didn't and still don't have a good idea how to name it. Feel free to suggest something. @andypugh opinion re. name?

@BsAtHome
Copy link
Copy Markdown
Contributor

hmm. I guess my description presupposes some context. It could certainly be more verbose.

Yes, and that is exactly the problem you yourself pointed out and warned about when adding "smart code".

Regarding the name, I didn't and still don't have a good idea how to name it. Feel free to suggest something. @andypugh opinion re. name?

What is its primary function? Usually, name follows function.

@BsAtHome
Copy link
Copy Markdown
Contributor

Just one more thing... This whole routine is being added because the usual algorithm has one or two conditionals in performing a fixup.

Then why is this being added as a subroutine involving two non-local jumps (call and ret), disregarding possible PLT indirection, and at least two stack operations, assuming the arguments are register passed? I fail to see the advantage in that.

@rmu75
Copy link
Copy Markdown
Collaborator Author

rmu75 commented Apr 15, 2026

I think the "cleverness" is adequately addressed in the code comment.

I propose this code to be added to have one (and only one) correct implementation that can deal with all cases, not to avoid branches or whatever. Correcctness trumps speed.

Unconditional jumps are cheap. I don't think a macro is warranted here, neither pollution of the header file with implementation of an inline function or a ".inc" code include file.

@BsAtHome
Copy link
Copy Markdown
Contributor

I think the "cleverness" is adequately addressed in the code comment.

We apparently have very different ideas of "adequately addressed".

I propose this code to be added to have one (and only one) correct implementation that can deal with all cases, not to avoid branches or whatever. Correcctness trumps speed.

I can accept correctness, in code. But I fail to see that the implementation is actually correct in the greater context.

Unconditional jumps are cheap.

No, they are not cheap. These are not simple jumps and involve stack frames. If this code is PLT indirected, then there is one more indirect jump involved too.

I don't think a macro is warranted here, neither pollution of the header file with implementation of an inline function or a ".inc" code include file.

It shouldn't be a macro as such. It should be an inline function that is always inlined. Then the compiler has a chance of performing optimizations and is not forced to create and tear down a stack frame construct (requiring flushing local register scheduling).

@rmu75 rmu75 force-pushed the rs/hal_extend_int branch from 7eafc42 to 141c035 Compare April 15, 2026 11:51
helper function to deal with wrap around and extension of lower-width counters to 64bit ints.
@rmu75 rmu75 force-pushed the rs/hal_extend_int branch from 141c035 to 4dc0e03 Compare April 15, 2026 11:52
Comment thread src/hal/hal.h
Comment on lines +711 to +712
extern rtapi_s64 hal_extend_int(rtapi_s64 old, rtapi_s64 newlow, int nbits);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prototype is effectively superfluous unless you try to link in a system that does both not inline and not define that static function locally as static. But, since we are using a C compiler that actually does inline, I do not see the point of the prototype.

Or is there something that I missed?

Comment thread src/hal/hal.h
Comment on lines +1025 to +1027
Attention has to be paied that the magnitude of increments / decrements
of the counter stay within 1<<(nshift-1) between calls to this function,
else the wrap around will be counted in the wrong direction.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may think that I am pedantic here, but I'm just reading from a blanked mind (like one from 5 years in the future) and try to fit that into why I would need this function.

It still does not explain what the arguments old and newlow actually are in the context of extending a counter. Which of these arguments is the counter?
The names suggest that there is something "old", but it is unclear what is old. There also is something "new" and apparently also "low" at the same time. But I cannot fathom, from the description, how these names fit into the algorithm.

BTW, the name hal_extend_counter() does seem much more appropriate, but there is more going on than just extending the counter. It is calculating a counter increment from the difference of the arguments and adding that difference to one of the arguments. This exploits the behaviour of unsigned verses two's complement calculations. It uses smaller values as source into a larger register set as if the calculation was performed on a "nbits" register in an unsigned operation (exploiting overflow behaviour) casted to a two's complement result. The final calculation is furthermore (ab)using the unsigned/signed duality. How exactly is this an "easy" algorithm to follow?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment next to prototype. This comment here in the impl explains what the code is doing and is inside. IDE will display the other comment if oyu hover over a function call.

In fact it doesn't matter if you work with signed or unsigned counter values, the important thing is that the difference has the proper sign and is calculated with width nbits. instead of figuring out the high bits, shifting it up gets proper overflow behaviour and the "extra (0) bits" towards LSB don't matter.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before i force push and trigger yet another senseless CI run what about

    /* Extend low-bit-width counter value to 64bit, counting wrap arounds resp.
       "rotations" in additional bits. 

       see https://github.com/LinuxCNC/linuxcnc/pull/3932#issuecomment-4239206615
       This code avoids branches and undefined behaviour due to signed overflow
       Idea from MicroPython.

       To avoid messy code to sign extend and overflowa lower-width value to
       64bits, this code shifts the counter value to the end of a 64bit int.
       Thus the calculated delta value overflows correctly and gets the correct
       sign.

       Shifting back the delta assures the sign is extended properly.
       Addition as unsigned avoids signed overflow, which would be undefined
       behaviour.

       Attention has to be paid that the magnitude of increments / decrements
       of the counter stay within 1<<(nshift-1) between calls to this function,
       else the wrap around will be counted in the wrong direction.

       Heuristically, if your encoder can do more than half a rotation between
       calls to this function, it is impossible to deduce which direction it
       went.

       Code contributed by Jeff Epler.

       Example with 3bit absolute hardware encoder and 8bit extended counter
       (nshift would be 5):

       counter at 7, transition from 111 (7) to 000 (0) should increment the
       extended counter to 8.

       call hal_extend_counter(7, 0, 3)
       oldlow_shifted = 7<<5 = 224 (1110 0000)
       newlow_shifted = 0<<5 = 0
       delta_shifted = 0 - 224 = -224 (1 0010 0000) = 32 (0010 0000) truncated to 8 bits 
       old + (32 >> 5) = 7 + 1 = 8
    */

Copy link
Copy Markdown
Contributor

@BsAtHome BsAtHome Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion:
hal_extend_counter(): Calculate the actual counter value from one reading to the next while compensating for counter wrapping.
Counters with a limited bit-width will wrap while counting up or down. You need an unwrapped counter value to calculate the actual counter difference between readings. This function calculates the new reading's actual value based on the current counter value (newlow), the bit-width (nbits) of the counter and the previous (old) counter value. This algorithm uses a branchless approach.

For any set of bits in a counter, we extend its width to 64 bits by moving the most significant bit of the counter into the most significant bit of the processor register. This creates a counter as if it were a full register's width and the difference between the counter positions will have the proper sign, as calculated by the processor. The difference is then corrected back, preserving sign, to be of the original counter's bit-width. This calculated difference is added to the old counter value and returned as the actual new counter's value. The returned value thus mitigates any counter wrapping and preserves counting direction.

The limit when using N-bit bit-width limited counters is that the maximum count difference between subsequent invocation may not be larger that 2**(N-1)-1. Otherwise it is impossible to determine the direction of the counter.

Other comments:

Shifting back the delta assures the sign is extended properly.

That is very poorly defined behaviour. Right shifting may use arithmetic shift or logic shift. Depending on implementation specific behaviour at this level may not be wise. Right shifting signed values is generally bad practice and should be replaced by division (diff_shifted >> nshift => diff_shifted / (1LL << nshift)).

Addition as unsigned avoids signed overflow, which would be undefined behaviour.

You are saying that "signed overflow is undefined" is problematic and asserting that "unsigned overflow is defined". They are both problematic. On most architectures (all in practice) they work exactly the same way because there is only one adder. It is the interpretation of status flags afterwards that introduces the problem.

Replacing the code with only signed values produces exactly the same result:

    rtapi_s64 oldlow_shifted = old << nshift;
    rtapi_s64 newlow_shifted = newlow << nshift;
    rtapi_s64 diff_shifted = newlow_shifted - oldlow_shifted;
    return old + (diff_shifted >> nshift);

The reason for this to work is that both the signed and unsigned 64-bit values behave the same in addition and subtraction, as well as left shift. The code was created to be branch-free and it is only in conditional branches that the difference between signed and unsigned becomes potentially problematic. Therefore, I fail to see the argument in using casts to unsigned.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I think that hal_unwrap_counter() would be a better name as it describes what actually is happening.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Counters with a limited bit-width will wrap while counting up or down.

Maybe "Hardware counters typically communicate a limited number of bits and will wrap when counting up or down"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The phenomenon of wrapping is not limited to hardware counters. It cal also happen in interfacing in general.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without cast to unsigned, the compiler may determine in some specific circumstances that some values can overflow

Which circumstances?

Copy link
Copy Markdown
Collaborator Author

@rmu75 rmu75 Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which circumstances?
please look at the whole comment and the link to the documentation of gcc.

That is hand waving. I did read the docs and cannot find the "specific conditions" you are referring to.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right shift of signed ints is in theory implementation defined but in practice every implementation does arithmetic shift.

https://cmu-sei.github.io/secure-coding-standards/sei-cert-c-coding-standard/recommendations/integers-int/int13-c/ (risk assessment: severity high)

@rmu75
Copy link
Copy Markdown
Collaborator Author

rmu75 commented Apr 16, 2026 via email

@BsAtHome
Copy link
Copy Markdown
Contributor

Lets me offer a compromise.

The cast to unsigned is fine. It doesn't hurt as such.

The right shift of a signed type does bother me. However, gcc supports it. You need to verify and make sure that clang also handles the right shift case like gcc. Additionally, you need to add a preprocessor test to check for gcc/clang (and their respective lowest version if applicable) and bail out with an error message if that is not what is used.

The advantage with the shift is that the whole algorithm is reduced to six assembly instructions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants