memutils: Replacement of libc string.h functions - currently only Dmemset() #2662
Conversation
wilzbach left a comment
A few initial comments. I think this initial PR will take quite a bit of hard work, as you need to get used to a few things, but the other ones should be a lot easier and more straightforward after that.
BTW, with these kinds of things I highly recommend checking the ASM in GDC, as the compiler might silently turn your code back into a call to the C memset function ;-) Seems to work fine in this case though: https://explore.dgnu.org/z/-uJfsJ I'll need to have a look at the GCC sources to see if we can somehow disable this pattern matching / rewriting for individual functions, just to be safe.
Add an -O3 there and you'll see it jumps to memset. :P
Just a general high-level question: how is this meant to scale to the plethora of already supported targets - ARM, MIPS, RISCV, PPC, SystemZ, HPPA, SPARC, and their 64-bit variants, to name a few? Have considerations been made for strict-alignment targets? Big-endian targets? My assumption would be that the performance would be a net loss for everyone, leaving some improvements wanting in the naive implementation - which could be done by handling small arrays with minimal branching, then setting values 32 or 64 bytes at a time depending on how big the array is.
@ibuclaw BTW,
@ibuclaw The project was initially focused on DMD and x86_64. For all the rest of the targets, the goal was to just have a naive working implementation. That changed in the last week, which meant throwing most of the (ASM) code away; I started again with intrinsics, optimizing for LDC etc. But now time is constrained. Seb and I will try to make the best of the time left, but the community can influence us, so I'm open to suggestions. I didn't really understand what you meant about the naive implementation. Setting 32 or 64 bytes at a time requires SIMD, which is done in the SIMD implementation, along with other optimizations (reaching alignment, and there was a switch fall-through for small sizes, but that had problems, so it forwards to naive; there are some other tweaks but those require ASM-level control). The optimizations possible without SIMD are not a lot. They're things like the GNU algorithm, which pretty much reaches 4/8-byte alignment. But those things are trivial for a compiler to do itself.
Setting then becomes (make sure pointer is aligned!) or
I'm sure the compiler could go a long way by itself, but sometimes you need to give it a gentle nudge to make it go one better. Anyway, I'll wait to see how this ends up, as things are still in flux. |
So, you mean 32/64 bits at a time, not bytes. Then yes, as I said, we're going back to the modified GNU algorithm: https://www.embedded.com/print/4024961 (that's for memcpy() but the "GNU algorithm" is thrown around when talking about the idea of "align to a 4/8-byte boundary and do appropriate moves"). Edit: Well, for clarification, it's not a switch fall-through. It seems worse than that, since it doesn't use a jump table and goes byte-by-byte etc., but the idea is the same.
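For reference, the "align to a word boundary, then store whole words" idea from the linked article can be sketched in C. This is a hedged illustration, not the PR's actual code: the name `naive_memset` and the 8-byte word size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the modified GNU algorithm: byte stores until the
   pointer is word-aligned, then word-sized stores, then the tail. */
static void naive_memset(void *dst, unsigned char val, size_t n)
{
    unsigned char *p = dst;

    /* Head: byte stores until p is 8-byte aligned (or n runs out). */
    while (n > 0 && ((uintptr_t)p & 7) != 0) {
        *p++ = val;
        n--;
    }

    /* Body: broadcast val to all 8 bytes, store one word at a time. */
    uint64_t v = val * 0x0101010101010101ULL;
    while (n >= 8) {
        *(uint64_t *)p = v;
        p += 8;
        n -= 8;
    }

    /* Tail: remaining bytes. */
    while (n-- > 0)
        *p++ = val;
}
```

An aligned `uint64_t` store here is what a compiler would also emit for the plain byte loop at higher optimization levels, which is why the thread calls these non-SIMD tweaks trivial for the compiler to do itself.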
src/core/experimental/memutils.d
Outdated
* the size of the array element) to `val`.
* Otherwise, set T.sizeof bytes to `val` starting from the address of `dst`.
*/
void memset(T)(ref T dst, const ubyte val)
Is this undocumented (ddoc) on purpose?
You mean it should be:
/**
* ...
*/
?
Yes, and a Params section (https://dlang.org/spec/ddoc.html#standard_sections) is probably best too.
Hmm, thanks. I'm not very accustomed yet to the logistics of contributing to D.
Probably you can add `scope` to the reference? Explicit `@nogc nothrow` may be good too.
Yes, I'm not very familiar with `scope`, but `@nogc` and `nothrow` are good.
Thanks for your pull request and interest in making D better, @baziotis! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references: Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally: If you don't have a local development environment set up, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + druntime#2662"
src/core/experimental/memutils.d
Outdated
* dst = Memory destination whose bytes are to be set to `val`.
*
* Returns:
*     Nothing.
You can just leave the Returns section out if it's void. 😉
Haha, I didn't like it either. But I wanted a way to note that this memset is different from the libc one in that it returns nothing. I'll add it as a N.B.
I think something went wrong with Buildkite that is not a problem with this PR.
see e.g. dlang/dmd#10244 (review)
Sorry, but I don't understand what I should do.
Nothing, it is unrelated to this PR.
Ah, ok, that's what I figured; I was just pointing to the problem. Thanks.
One thing to take into account is that the LLVM optimizer recognizes
Before using/enabling this on the optimizing compilers, please do extensive testing and assembly output checking. The current set of tests is way too small. After just a few seconds of thinking about it, I already see things untested: misaligned pointer,
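To make that ask concrete, here is a hedged sketch of exhaustive (offset, length) coverage against the libc reference, including misaligned pointers and zero-length calls. It is written in C for illustration; the names `check_against_libc` and `byte_memset` are hypothetical, and the PR's real tests would live in D `unittest` blocks.

```c
#include <assert.h>
#include <string.h>

/* Compare a memset-like function against libc memset for every small
   (alignment offset, length) pair. Sentinel bytes around the target
   region catch buffer overruns. Returns the number of failing cases. */
static int check_against_libc(void (*my_memset)(void *, unsigned char, size_t))
{
    enum { PAD = 64 };
    unsigned char got[PAD], want[PAD];
    int mismatches = 0;

    for (size_t off = 0; off < 16; off++) {
        for (size_t len = 0; off + len <= PAD; len++) {
            memset(got, 0xCC, PAD);    /* sentinel to catch overruns */
            memset(want, 0xCC, PAD);
            my_memset(got + off, 0xAB, len);
            memset(want + off, 0xAB, len);
            if (memcmp(got, want, PAD) != 0)
                mismatches++;
        }
    }
    return mismatches;
}

/* Trivial byte-loop implementation standing in for Dmemset here. */
static void byte_memset(void *d, unsigned char v, size_t n)
{
    unsigned char *p = d;
    while (n--)
        *p++ = v;
}
```

In the real test suite one would pass the function under review (e.g. `Dmemset`) instead of the stand-in, and expect zero mismatches.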
* (whose count is the length of the array times
* the size of the array element) to `val`.
* Otherwise, set T.sizeof bytes to `val` starting from the address of `dst`.
* N.B.: Contrary to the C Standard Library memset(), this function returns nothing.
Why deviate from the C stdlib memset on this point?
Because these utilities were not created with the idea of having the exact C interface. We decided to drop legacy C stuff that seems useless, like the return value and the fact that memset gets an int instead of a byte.
I'm not saying this is necessarily the best idea ever, because no matter how irrelevant the legacy stuff is, the thing is that people have been used to it for years.
That is true. It's supposed to be done on GDC as well. It's supposed that on GDC, you can set the

As far as I have tested, the functions are not turned into LLVM / GDC counterparts, except for the naive versions, which is ok. Other than that, I'm not sure what exactly you want tested.

Edit: Btw, as far as performance boost is concerned, things like assume_aligned would be great, but I don't know of a way to do it on either LDC or GDC.
The things you mention are tested, except for

Meaning, except for the basic unittests.
Using

If you haven't seen it yet, this thread is interesting: https://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html
I meant that the performance of this may heavily depend on how it is used in user code, plus separate compilation. Will the function be opaque to the optimizer or not? Think of user code like this:
Ah yes, I see now how the tests work. By the way, it would perhaps be clearer to have all functions take a
Ah, I see, ok. I mentioned

The way I see it, my

Although ideally, we should have some control besides flags for saying "we don't want this to be replaced" or something. Ok, this was somewhat long, I hope it was clear.
No, thanks! I'll check it.
Yes, this is an important question. It adds to the questions I wrote above and it's on point. Well, I don't know how one can tell the compiler that. As I said, I don't even know how to get something like assume_aligned, which is quite important.
Good point. I have tested by repeatedly inspecting in the debugger, but a more formal test should be added. Here's the
Well, it's just another
Hmm, you mean the
Ah I didn't mean to add an
Yes indeed.
Sorry, I did not get that one.
Yes,
{
    /* SIMD implementation
     */
    private void Dmemset(void *d, const uint val, size_t n) nothrow @nogc
I can't see a path for val == 0... It is usually possible to make zeromem substantially faster.
But again, none of this code should apply to LDC/GDC, those compilers have intrinsics for this stuff; this should be considered strictly for DMD.
For instance, on PowerPC, there's an instruction dcbz (data cache-block zero) which resets an entire 256byte cache line to zero in one instantaneous operation... I can't see your PPC branch that takes advantage of 256byte alignment and dcbz... but the backends all do.
Assuming this is targeting DMD, then your assumptions that the compiler won't do anything interesting and that the target is x64 are both reasonable; otherwise I can't get behind any of this code.
I can't see a path for val == 0... It is usually possible to make zeromem substantially faster.
No memset that I know of that targets x64 does anything different for val == 0.
But again, none of this code should apply to LDC/GDC, those compilers have intrinsics for this stuff; this should be considered strictly for DMD.
Check this: #2687 (comment)
PowerPC was not a target of this project.
Assuming this is targetting DMD, then your assumptions that the compiler won't do anything interesting, and that the target is x64 are both reasonable assumptions, otherwise I can't get behind any of this code.
The focus was initially DMD, then LDC and GDC by not using their intrinsics. It's again the same question - What happens when libc is not available and do we care?
PowerPC was not a target of this project.
Right, that's my point, and why I think it's not a good idea to re-implement these functions. There are too many versions of them to worry about, and maintaining great perf is a moving target as architectures like x86 and ARM evolve.
What happens when libc is not available and do we care?
libc is always available... what is a case when it's not?
I'd also like to know if the intrinsics definitely call out to libc, or if they call into the compiler's runtime lib. What is the symbol that the intrinsics call?
I'll have to look at this. Let me be clear again, I don't know for a fact that they call libc, it just seemed that way as they execute the same code. I'll definitely have to look at that if we move forward with this PR.
    return;
}
void *temp = d + n - 0x10;       // Used for the last 16 bytes
const uint v = val * 0x01010101; // Broadcast val to all 4 bytes
mul is almost never a good idea. I suspect the mul may be a bottleneck in the ~32-128 byte range. If you're gonna do it, why not extend it to 8 bytes and get more value from it?
Do you have a reference for this code? It doesn't look very optimal to me. If you're gonna use SSE, there are broadcast and permute functions, which don't introduce hazards as bad as mul...
Yes, it's inspired by Agner Fog. I didn't copy code from him but I have read his optimization manual. This mul trick is his. And no, a mul in x86 is not a problem nowadays.
Our code turned out to be similar e.g. his AVX version: https://github.com/tpn/agner/blob/master/asmlib/asmlibSrc/memset64.asm#L188
It's also inspired by GCC.
Edit: Oh, and there's no point in doing it in 8 bytes. It is immediately broadcast into an XMM register.
I'm hyper-aware of this mul trick; I've been doing this for decades, but it absolutely IS a problem 'nowadays'... I've written a lot of code using mul tricks like this, and been surprised when they become an unintuitive bottleneck.
It might not be a problem here, because I think it's usually the case that memset is not called in a hot-loop, but multiplies usually have a lot of limitations on the pipeline.
There's usually only one imul pipeline, and while the latency might not be so bad these days, the circuit is kinda recursive, so it can't accept a new instruction per cycle like add/sub circuits, so if there are multiple mul's contending for the limited imul resource, they create stalling hazards on each other.
imul is often used in address calculation, so there's a reasonable possibility of contention, but there's an if above this, so the pipeline might already have flushed...
I do a lot of work with colour precision scaling using this same trick; the imul is almost always the limiting factor in my loops.
Anyway, I only say this because you're shoving the value straight into SSE in the following line, where you can use much faster permute operations to broadcast the value very easily (like pshufb, or complements on other architectures).
I didn't know all that stuff about muls limiting the pipeline. I just knew that the cycles they consume are about the same as say add / sub. Is there somewhere I can read more about those?
But the probability of Agner not having thought of this is close to 0. :P
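For readers following along, the multiply trick under discussion replicates one byte into every byte of a word; an SSE version would instead broadcast with a shuffle such as pshufb. A minimal C illustration of the scalar part (the helper names are ours, not the PR's):

```c
#include <assert.h>
#include <stdint.h>

/* Replicate one byte into all 4 bytes of a 32-bit word (the
   `val * 0x01010101` line from the diff), and the 8-byte variant
   ibuclaw suggests extending it to. */
static uint32_t broadcast4(uint8_t val) { return val * 0x01010101u; }
static uint64_t broadcast8(uint8_t val) { return val * 0x0101010101010101ULL; }
```

The multiply works because each set bit pattern 0x01 in the constant copies `val` into one byte position with no carries between bytes (since `val <= 0xFF`).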
Closed due to not being useful - more info here.
As this is my first PR to D runtime, here are some notes:
- unittest: these tests can't really be put in `unittest`, unless I miss something.
- `Dmemset`: I think that a better choice would be to have such code in something like `experimental/simd.d`, but I was not sure if that's desired.