Commit 6ba8c8c

Simplify and speed up the computation of the call-node-specific line timings / address timings with the help of getCallNodeFramePerStack.
1 parent 9add0bf commit 6ba8c8c
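
For readers who have not seen the "call node frame per stack" idea before, here is a minimal sketch of what such a mapping could look like for the non-inverted tree. It is modeled on the prefix-inheritance loop in the removed `getStackAddressInfoForCallNodeNonInverted`, and it is not the real `getCallNodeFramePerStack`, whose implementation is not part of this file's diff. `MiniStackTable` and `sketchCallNodeFramePerStack` are hypothetical names, and the tiny stack tree is made up for illustration.

```typescript
// Hypothetical, simplified stand-in for the profiler's StackTable.
interface MiniStackTable {
  prefix: Array<number | null>; // parent stack, null for root stacks
  frame: number[]; // frame index of each stack
  length: number;
}

// Sketch only (assumption): for every stack, find the frame of the ancestor
// stack (or the stack itself) that maps to `callNodeIndex`, or -1 if none.
// Relies on the stack table being topologically ordered (prefix before child).
function sketchCallNodeFramePerStack(
  stackTable: MiniStackTable,
  stackIndexToCallNodeIndex: Int32Array,
  callNodeIndex: number
): Int32Array {
  const framePerStack = new Int32Array(stackTable.length);
  for (let stack = 0; stack < stackTable.length; stack++) {
    if (stackIndexToCallNodeIndex[stack] === callNodeIndex) {
      // This stack maps to the call node itself: use its own frame.
      framePerStack[stack] = stackTable.frame[stack];
    } else {
      // Otherwise inherit whatever the prefix stack resolved to.
      const prefix = stackTable.prefix[stack];
      framePerStack[stack] = prefix !== null ? framePerStack[prefix] : -1;
    }
  }
  return framePerStack;
}

// Made-up stack tree: A -> B -> C (twice), D under the second C, E under A.
const miniStackTable: MiniStackTable = {
  prefix: [null, 0, 1, 1, 3, 0],
  frame: [0, 1, 2, 3, 4, 5],
  length: 6,
};
// Call nodes: 0 = [A], 1 = [A, B], 2 = [A, B, C], 3 = [A, B, C, D], 4 = [A, E].
const stackToCallNode = Int32Array.from([0, 1, 2, 2, 3, 4]);

const framePerStack = sketchCallNodeFramePerStack(
  miniStackTable,
  stackToCallNode,
  2
);
console.log(Array.from(framePerStack));
```

With a single `Int32Array` like this, a consumer no longer needs a per-stack `Set<Address>`; it can look up the frame for a sample's stack and read the address and native symbol from the frame table.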

5 files changed

Lines changed: 107 additions & 656 deletions

File tree

src/profile-logic/address-timings.ts

Lines changed: 36 additions & 330 deletions
@@ -72,16 +72,12 @@ import type {
   FuncTable,
   StackTable,
   SamplesLikeTable,
-  IndexIntoCallNodeTable,
   IndexIntoNativeSymbolTable,
   StackAddressInfo,
   AddressTimings,
   Address,
 } from 'firefox-profiler/types';
 
-import { getMatchingAncestorStackForInvertedCallNode } from './profile-data';
-import type { CallNodeInfo, CallNodeInfoInverted } from './call-node-info';
-
 /**
  * For each stack in `stackTable`, and one specific native symbol, compute the
  * sets of addresses for frames belonging to that native symbol that are hit by
@@ -180,312 +176,6 @@ export function getStackAddressInfo(
   };
 }
 
-/**
- * Gathers the addresses which are hit by a given call node.
- * This is different from `getStackAddressInfo`: `getStackAddressInfo` counts
- * address hits anywhere in the stack, and this function only counts hits *in
- * the given call node*.
- *
- * This is useful when opening the assembly view from a call node: We can
- * directly jump to the place in the assembly where *this particular call node*
- * spends its time.
- *
- * Returns a StackAddressInfo object for the given stackTable and for the library
- * which contains the call node's func.
- */
-export function getStackAddressInfoForCallNode(
-  stackTable: StackTable,
-  frameTable: FrameTable,
-  callNodeIndex: IndexIntoCallNodeTable,
-  callNodeInfo: CallNodeInfo,
-  nativeSymbol: IndexIntoNativeSymbolTable
-): StackAddressInfo {
-  const callNodeInfoInverted = callNodeInfo.asInverted();
-  return callNodeInfoInverted !== null
-    ? getStackAddressInfoForCallNodeInverted(
-        stackTable,
-        frameTable,
-        callNodeIndex,
-        callNodeInfoInverted,
-        nativeSymbol
-      )
-    : getStackAddressInfoForCallNodeNonInverted(
-        stackTable,
-        frameTable,
-        callNodeIndex,
-        callNodeInfo,
-        nativeSymbol
-      );
-}
-
-/**
- * This function handles the non-inverted case of getStackAddressInfoForCallNode.
- *
- * Gathers the addresses which are hit by a given call node in a given native
- * symbol.
- *
- * This is best explained with an example. We first start with a case that does
- * not have any inlining, because this is already complicated enough.
- *
- * Let the call node be the node for the call path [A, B, C].
- * Let the native symbol be C.
- * Let every frame have inlineDepth:0.
- * Let there be a native symbol for every func, with the same name as the func.
- * Let this be the stack tree:
- *
- * - stack 1, func A
- * - stack 2, func B
- * - stack 3, func C, address 0x30
- * - stack 4, func C, address 0x40
- * - stack 5, func B
- * - stack 6, func C, address 0x60
- * - stack 7, func C, address 0x70
- * - stack 8, func D
- * - stack 9, func E
- * - stack 10, func F
- *
- * This maps to the following call tree:
- *
- * - call node 1, func A
- * - call node 2, func B
- * - call node 3, func C
- * - call node 4, func D
- * - call node 5, func E
- * - call node 6, func F
- *
- * The call path [A, B, C] uniquely identifies call node 3.
- * The following stacks all "collapse into" ("map to") call node 3:
- * stack 3, 4, 6 and 7.
- * Stack 8 maps to call node 4, which is a child of call node 3.
- * Stacks 1, 2, 5, 9 and 10 are outside the call path [A, B, C].
- *
- * In this function, we only compute "address hits" that are contributed to
- * the given call node.
- * Stacks 3, 4, 6 and 7 all contribute their time both as "self time"
- * and as "total time" to call node 3, at the addresses 0x30, 0x40, 0x60,
- * and 0x70, respectively.
- * Stack 8 also hits call node 3 at address 0x70, but does not contribute to
- * call node 3's "self time", it only contributes to its "total time".
- * Stacks 1, 2, 5, 9 and 10 don't contribute to call node 3's self or total time.
- *
- * Now here's an example *with* inlining.
- *
- * Let the call node be the node for the call path [A, B, C].
- * Let the native symbol be B.
- * Let this be the stack tree:
- *
- * - stack 1, func A, nativeSymbol A
- * - stack 2, func B, nativeSymbol B, address 0x40
- * - stack 3, func C, nativeSymbol B, address 0x40, inlineDepth 1
- * - stack 4, func B, nativeSymbol B, address 0x45
- * - stack 5, func C, nativeSymbol B, address 0x45, inlineDepth 1
- * - stack 6, func D, nativeSymbol D
- * - stack 7, func E, nativeSymbol E
- * - stack 8, func A, nativeSymbol A, address 0x30
- * - stack 9, func B, nativeSymbol A, address 0x30, inlineDepth 1
- * - stack 10, func C, nativeSymbol A, address 0x30, inlineDepth 2
- *
- * This maps to the following call tree:
- *
- * - call node 1, func A
- * - call node 2, func B
- * - call node 3, func C
- * - call node 4, func D
- * - call node 5, func E
- *
- * The funky part here is that call node 3 has frames from two different native
- * symbols: Two from native symbol B, and one from native symbol A. That's
- * because B is present both as its own native symbol (separate outer function)
- * and as an inlined call from A. In other words, C has been inlined both into
- * a standalone B and also into another copy of B which was inlined into A.
- *
- * This means that, if the user double clicks call node 3, there are two
- * different symbols for which we may want to display the assembly code. And
- * depending on whether the assembly for A or for B is displayed, we want to
- * call this function for a different native symbol.
- *
- * In this example, we call this function for native symbol B.
- *
- * The call path [A, B, C] uniquely identifies call node 3.
- * The following stacks all "collapse into" ("map to") call node 3:
- * stack 3, 5 and 10. However, only stacks 3 and 5 belong to native symbol B;
- * stack 10 belongs to native symbol A.
- * Stack 6 maps to call node 4, which is a child of call node 3.
- * Stacks 1, 2, 4, 7, 8 and 9 are outside the call path [A, B, C].
- *
- * Stacks 3 and 5 both contribute their time both as "self time" and as "total
- * time" to call node 3 and native symbol B, at the addresses 0x40 and 0x45,
- * respectively. Stack 10 has the right call node but the wrong native symbol,
- * so it contributes to neither self nor total time.
- * Stack 6 also hits call node 3 at address 0x45, but does not contribute to
- * call node 3's "self time", it only contributes to its "total time".
- * Stacks 1, 2, 4, 7, 8 and 9 don't contribute to call node 3's self or total time.
- *
- * ---
- *
- * All stacks can contribute no more than one address in the given call node.
- * This is different from the getStackAddressInfo function above, where each
- * stack can hit many addresses of the same native symbol, because all of the ancestor
- * stacks are taken into account, rather than just one of them. Concretely,
- * this means that in the returned StackAddressInfo, each stackAddresses[stack]
- * set will only contain at most one element.
- *
- * The returned StackAddressInfo is computed as follows:
- *   selfAddress[stack]:
- *     For stacks that map to the given call node and whose nativeSymbol is the
- *     given native symbol, this is stack.frame.address.
- *     For all other stacks this is null.
- *   stackAddresses[stack]:
- *     For stacks that map to the given call node or one of its descendant
- *     call nodes, and whose nativeSymbol is the given native symbol, this is a
- *     set containing one element, which is ancestorStack.frame.address, where
- *     ancestorStack maps to the given call node.
- *     For all other stacks, this is null.
- */
-export function getStackAddressInfoForCallNodeNonInverted(
-  stackTable: StackTable,
-  frameTable: FrameTable,
-  callNodeIndex: IndexIntoCallNodeTable,
-  callNodeInfo: CallNodeInfo,
-  nativeSymbol: IndexIntoNativeSymbolTable
-): StackAddressInfo {
-  const stackIndexToCallNodeIndex =
-    callNodeInfo.getStackIndexToNonInvertedCallNodeIndex();
-
-  // "self address" == "the address which a stack's self time is contributed to"
-  const callNodeSelfAddressForAllStacks = [];
-  // "total addresses" == "the set of addresses whose total time this stack contributes to"
-  // Either null or a single-element set.
-  const callNodeTotalAddressesForAllStacks: Array<Set<Address> | null> = [];
-
-  // This loop takes advantage of the fact that the stack table is topologically ordered:
-  // Prefix stacks are always visited before their descendants.
-  for (let stackIndex = 0; stackIndex < stackTable.length; stackIndex++) {
-    let selfAddress: Address | null = null;
-    let totalAddresses: Set<Address> | null = null;
-    const frame = stackTable.frame[stackIndex];
-
-    if (
-      stackIndexToCallNodeIndex[stackIndex] === callNodeIndex &&
-      frameTable.nativeSymbol[frame] === nativeSymbol
-    ) {
-      // This stack contributes to the call node's self time for the right
-      // native symbol. We needed to check both, because multiple stacks for the
-      // same call node can have different native symbols.
-      selfAddress = frameTable.address[frame];
-      if (selfAddress !== -1) {
-        totalAddresses = new Set([selfAddress]);
-      }
-    } else {
-      // This stack does not map to the given call node or has the wrong native
-      // symbol. So this stack contributes no self time to the call node for the
-      // requested native symbol, and we leave selfAddress at null.
-      // As for totalTime, this stack contributes to the same address's totalTime
-      // as its parent stack: If it is a descendant of a stack X which maps to
-      // the given call node, then it contributes to stack X's address's totalTime,
-      // otherwise it contributes to no address's totalTime.
-      // In the example above, this is how stack 8 contributes to call node 3's
-      // totalTime.
-      const prefixStack = stackTable.prefix[stackIndex];
-      totalAddresses =
-        prefixStack !== null
-          ? callNodeTotalAddressesForAllStacks[prefixStack]
-          : null;
-    }
-
-    callNodeSelfAddressForAllStacks.push(selfAddress);
-    callNodeTotalAddressesForAllStacks.push(totalAddresses);
-  }
-  return {
-    selfAddress: callNodeSelfAddressForAllStacks,
-    stackAddresses: callNodeTotalAddressesForAllStacks,
-  };
-}
-
-/**
- * This handles the inverted case of getStackAddressInfoForCallNode.
- *
- * The returned StackAddressInfo is computed as follows:
- *   selfAddress[stack]:
- *     For (inverted thread) root stack nodes that map to the given call node
- *     and whose stack.frame.nativeSymbol is the given symbol, this is stack.frame.address.
- *     For (inverted thread) root stack nodes whose frame with a different symbol,
- *     or which don't map to the given call node, this is null.
- *     For (inverted thread) *non-root* stack nodes, this is the same as the selfAddress
- *     of the stack's prefix. This way, the selfAddress is always inherited from the
- *     subtree root.
- *   stackAddresses[stack]:
- *     For stacks that map to the given call node or one of its (inverted tree)
- *     descendant call nodes, this is a set containing one element, which is
- *     ancestorStack.frame.address, where ancestorStack maps to the given call
- *     node.
- *     For all other stacks, this is null.
- */
-export function getStackAddressInfoForCallNodeInverted(
-  stackTable: StackTable,
-  frameTable: FrameTable,
-  callNodeIndex: IndexIntoCallNodeTable,
-  callNodeInfo: CallNodeInfoInverted,
-  nativeSymbol: IndexIntoNativeSymbolTable
-): StackAddressInfo {
-  const depth = callNodeInfo.depthForNode(callNodeIndex);
-  const [rangeStart, rangeEnd] =
-    callNodeInfo.getSuffixOrderIndexRangeForCallNode(callNodeIndex);
-  const callNodeIsRootOfInvertedTree = callNodeInfo.isRoot(callNodeIndex);
-  const stackIndexToCallNodeIndex =
-    callNodeInfo.getStackIndexToNonInvertedCallNodeIndex();
-  const stackTablePrefixCol = stackTable.prefix;
-  const suffixOrderIndexes = callNodeInfo.getSuffixOrderIndexes();
-
-  // "self address" == "the address which a stack's self time is contributed to"
-  const callNodeSelfAddressForAllStacks = [];
-  // "total addresses" == "the set of addresses whose total time this stack contributes to"
-  // Either null or a single-element set.
-  const callNodeTotalAddressesForAllStacks = [];
-
-  for (let stackIndex = 0; stackIndex < stackTable.length; stackIndex++) {
-    let selfAddress: Address | null = null;
-    let totalAddresses: Set<Address> | null = null;
-
-    const stackForCallNode = getMatchingAncestorStackForInvertedCallNode(
-      stackIndex,
-      rangeStart,
-      rangeEnd,
-      suffixOrderIndexes,
-      depth,
-      stackIndexToCallNodeIndex,
-      stackTablePrefixCol
-    );
-    if (stackForCallNode !== null) {
-      const frameForCallNode = stackTable.frame[stackForCallNode];
-      if (frameTable.nativeSymbol[frameForCallNode] === nativeSymbol) {
-        // This stack contributes to the call node's total time for the right
-        // native symbol. We needed to check both, because multiple stacks for the
-        // same call node can have different native symbols.
-        const address = frameTable.address[frameForCallNode];
-        if (address !== -1) {
-          totalAddresses = new Set([address]);
-          if (callNodeIsRootOfInvertedTree) {
-            // This is a root of the inverted tree, and it is the given
-            // call node. That means that we have a self address.
-            selfAddress = address;
-          } else {
-            // This is not a root stack node, so no self time is spent
-            // in the given call node for this stack node.
-          }
-        }
-      }
-    }
-
-    callNodeSelfAddressForAllStacks.push(selfAddress);
-    callNodeTotalAddressesForAllStacks.push(totalAddresses);
-  }
-  return {
-    selfAddress: callNodeSelfAddressForAllStacks,
-    stackAddresses: callNodeTotalAddressesForAllStacks,
-  };
-}
-
 // An AddressTimings instance without any hits.
 export const emptyAddressTimings: AddressTimings = {
   totalAddressHits: new Map(),
@@ -531,33 +221,49 @@ export function getAddressTimings(
   return { totalAddressHits, selfAddressHits };
 }
 
-// Like getAddressTimings, but computes only the totalAddressHits.
-export function getTotalAddressTimings(
-  stackAddressInfo: StackAddressInfo | null,
-  samples: SamplesLikeTable
+// Returns the addresses which are hit within the specified native
+// symbol in a specific call node, along with the total of the
+// sample weights per address.
+// callNodeFramePerStack needs to be a mapping from stackIndex to the
+// corresponding frame in the call node of interest.
+export function getTotalAddressTimingsForCallNode(
+  samples: SamplesLikeTable,
+  callNodeFramePerStack: Int32Array,
+  frameTable: FrameTable,
+  nativeSymbol: IndexIntoNativeSymbolTable | null
 ): Map<Address, number> {
-  if (stackAddressInfo === null) {
+  if (nativeSymbol === null) {
     return new Map<Address, number>();
   }
-  const { stackAddresses } = stackAddressInfo;
-  const totalAddressHits: Map<Address, number> = new Map();
 
-  // Iterate over all the samples, and aggregate the sample's weight into the
-  // addresses which are hit by the sample's stack.
-  // TODO: Maybe aggregate sample count per stack first, and then visit each stack only once?
+  const totalPerAddress = new Map<Address, number>();
   for (let sampleIndex = 0; sampleIndex < samples.length; sampleIndex++) {
-    const stackIndex = samples.stack[sampleIndex];
-    if (stackIndex === null) {
+    const stack = samples.stack[sampleIndex];
+    if (stack === null) {
       continue;
     }
-    const weight = samples.weight ? samples.weight[sampleIndex] : 1;
-    const setOfHitAddresses = stackAddresses[stackIndex];
-    if (setOfHitAddresses !== null) {
-      for (const address of setOfHitAddresses) {
-        const oldHitCount = totalAddressHits.get(address) ?? 0;
-        totalAddressHits.set(address, oldHitCount + weight);
-      }
+    const callNodeFrame = callNodeFramePerStack[stack];
+    if (callNodeFrame === -1) {
+      // This sample does not contribute to the call node's total. Ignore.
+      continue;
+    }
+
+    if (frameTable.nativeSymbol[callNodeFrame] !== nativeSymbol) {
+      continue;
+    }
+
+    const address = frameTable.address[callNodeFrame];
+    if (address === -1) {
+      continue;
     }
+
+    const sampleWeight =
+      samples.weight !== null ? samples.weight[sampleIndex] : 1;
+    totalPerAddress.set(
+      address,
+      (totalPerAddress.get(address) ?? 0) + sampleWeight
+    );
   }
-  return totalAddressHits;
+
+  return totalPerAddress;
 }
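
To see the new aggregation at work outside the profiler, the following standalone sketch replays the same loop on made-up data loosely based on the inlining example from the removed doc comment (call path [A, B, C], native symbol B). `MiniFrameTable`, `MiniSamples`, `totalWeightPerAddress`, the symbol indexes, and the hand-written `callNodeFramePerStack` array are all assumptions of this sketch, not the profiler's real types or data.

```typescript
type Address = number;

// Hypothetical, simplified stand-ins for FrameTable and SamplesLikeTable.
interface MiniFrameTable {
  nativeSymbol: Array<number | null>;
  address: Address[]; // -1 means "no address"
}
interface MiniSamples {
  stack: Array<number | null>;
  weight: number[] | null; // null means every sample has weight 1
  length: number;
}

// Same control flow as the new function in the diff, on the simplified types.
function totalWeightPerAddress(
  samples: MiniSamples,
  callNodeFramePerStack: Int32Array,
  frameTable: MiniFrameTable,
  nativeSymbol: number | null
): Map<Address, number> {
  const totalPerAddress = new Map<Address, number>();
  if (nativeSymbol === null) {
    return totalPerAddress;
  }
  for (let i = 0; i < samples.length; i++) {
    const stack = samples.stack[i];
    if (stack === null) continue;
    const frame = callNodeFramePerStack[stack];
    if (frame === -1) continue; // stack is outside the call node
    if (frameTable.nativeSymbol[frame] !== nativeSymbol) continue;
    const address = frameTable.address[frame];
    if (address === -1) continue;
    const weight = samples.weight !== null ? samples.weight[i] : 1;
    totalPerAddress.set(address, (totalPerAddress.get(address) ?? 0) + weight);
  }
  return totalPerAddress;
}

// Native symbol indexes (assumed): A = 0, B = 1, D = 2, E = 3.
// Frame i belongs to stack i; addresses for A, D and E frames are invented.
const frames: MiniFrameTable = {
  nativeSymbol: [0, 1, 1, 1, 1, 2, 3, 0, 0, 0],
  address: [0x10, 0x40, 0x40, 0x45, 0x45, 0x60, 0x70, 0x30, 0x30, 0x30],
};
// Stacks 2, 4 and 9 map to call node [A, B, C]; stack 5 (func D) is a
// descendant of stack 4 and inherits its frame. Everything else is -1.
const callNodeFramePerStack = Int32Array.from([-1, -1, 2, -1, 4, 4, -1, -1, -1, 9]);
const samples: MiniSamples = {
  stack: [2, 4, 5, 9, 6, null],
  weight: null,
  length: 6,
};

const hitsInB = totalWeightPerAddress(samples, callNodeFramePerStack, frames, 1);
console.log(hitsInB);
```

For native symbol B this yields a weight of 1 at address 0x40 and 2 at address 0x45. The sample on stack 9 resolves to the right call node but to a frame in native symbol A, so it is skipped, which mirrors the "right call node but wrong native symbol" case from the removed comment.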
