Conversation
| RepoFullName | ||
| : The full name of the Git repository (in the format of {"owner/repo"}). |
There was a problem hiding this comment.
The GitHub API defines this as nameWithOwner: https://docs.github.com/en/graphql/reference/objects#repository
There was a problem hiding this comment.
They also use full_name which is what I stumbled upon first 🤔
https://docs.github.com/en/rest/apps/installations?apiVersion=2022-11-28
I also don't want to overindex on github specifically. "owner" or prefixes is not a universal concept (I guess I should update and say "in the format of {"owner/repo"}, if applicable" or something)
fwiw:
- GitLab: Uses path_with_namespace (https://docs.gitlab.com/api/projects/)
- Bitbucket: Uses either full_name or slug, can't quite tell (https://developer.atlassian.com/cloud/bitbucket/rest/api-group-repositories/)
There was a problem hiding this comment.
Sounds good 👍 Thanks for finding those. Agree about not making it too much GH specific 👍
| : The full name of the Git repository (in the format of {"owner/repo"}). | ||
|
|
||
| Version | ||
| : A positive integer that increments each time the document body changes. |
There was a problem hiding this comment.
Any reason not to use the git sha? I guess it's keeping the cardinality lower? But on the other hand version requires maintaining that state somewhere (probably the TD registry?)
There was a problem hiding this comment.
(1) (arguably) less human friendly
that implies that FQONs become something like:
Foo:bar:baz:f7ea691e89ac59e10cdd58700c833158b74ec607
which includes a hash, which is kinda what we were trying to avoid - we want something totally human friendly for folks to copy/paste and avoid using the op name instead.
(2) footgun - uniqueness concerns?
Git shas can be abbreviated and still resolve to a full SHA if enough bytes are available to make it unambiguous. Slight concern folks might (mistakenly) apply this here, leading to multiple pointers to the same operation, making such that there is no longer a single unambiguous FQON for an operation.
as much as the spec could outlaw it, these could all feasibly map to the same operation:
Foo:bar:baz:f7ea691e89ac59e10cdd58700c833158b74ec607
Foo:bar:baz:f7ea691e89ac59e10cdd58700c833158b74
Foo:bar:baz:f7ea691e89ac59e1
Foo:bar:baz:f7ea6
(Foo:bar:baz:f7ea6 is kinda nice actually...)
but the footgun is that without unique keys in a config file, users might mistakenly use truncated hashes which may or may not conflict with other fqons that point to the same operation
(3) less semantic information encoded
Foo:bar:baz:3
tells you this is the 3rd version, and a list of FQONs for the same operation can be ordered using FQONs alone
All that said, i'm not super against using git shas if folks feel strongly.
My vote is still version, but i'm curious what others think.
There was a problem hiding this comment.
This all makes sense. I like version too. My main follow up question is whether it's something the client should be aware of. In which case it needs to be returned when registrating the TDs. But we can probably start with this being recorded by the operation registry 👍
There was a problem hiding this comment.
You can also calculate the version number without storing it
logic extracted from our dashboard:
/**
* Keep track of how many times we've inserted a "version" of each document
*/
const collectionInsertCount = new Map();
allDocuments.forEach((document) => {
// Look up the collection id for this document
const collectionHash = `${document.name}|${document.repository}|${document.submittedProject}`;
const collectionId = collectionMap.get(collectionHash);
assert(typeof collectionId === 'number', 'expected document collection to exist');
const insertCount = documentCollectionCount.get(collectionHash);
assert(typeof insertCount === 'number', 'expected document collection to exist');
// Calculate the "version" of this document in the collection by counting how
// many times we've already inserted rows into this collection.
//
// (allDocuments is pre-sorted by date so this should be stable. We're
// keeping track of this in memory to avoid a db lookup per insert.)
const version = insertCount + 1;
insertDocument.run(
document.hash,
collectionId,
version,
document.submittedTime,
document.submittedBy,
document.document,
document.lastUsedInProd,
document.gitSha,
document.sourceFile,
);
documentCollectionCount.set(collectionHash, insertCount + 1);
});we since switched to actually storing version as an attribute in the registry since it's more stable and simplifies lookup logic.
|
|
||
| ## Definition | ||
|
|
||
| FullyQualifiedOperationName :: OperationName : Project? : RepoFullName : Version |
There was a problem hiding this comment.
Having Project before RepoFullName breaks my mental model a little bit. I understand Project is for monorepos that may have several clients? Then I would prefer something more like so:
GetFoo:bazcorp/qux/barpkg:1
So effectively, more like
FullyQualifiedOperationName :: OperationName : RepoOwner / RepoName [/ Subsystem]? : Version
This way it goes from general to specific.
Actually, thinking about this, why not even put the operation name at the end?
bazcorp/qux/barpkg:GetFoo:1
For someone using Maven Coordinates a lot, this would look a lot more familiar to me
There was a problem hiding this comment.
++ yeah this ordering is a little weird, and an artifact of how we deployed this internally.
I wanted in theory for it to be ordered like a uri "in order of decreasing significance from left to right"
So that would imply
Repo : Project : OperationName : Version
Which is the same ordering as your bazcorp/qux/barpkg:GetFoo:1.
...however, when doing this internally, I wanted operation name to appear first, so I just yanked it 🤷.
Motivation is that when displayed in logs (grafana or splunk or whatever) or dashboard panels, a long FQON is likely to get truncated in the UI. By putting the op name first, there's more a guarantee you'll still see a difference in the value between rows for different operations. So I think there's some value and practicality here - but yes it does feel kinda weird as a pure uri.
Is that tradeoff worth it? Vote now!
There was a problem hiding this comment.
There was a similar discussion in the graphq-over-http working group.
For URLs, the last component of the path is highlighted/promoted by a lot of tooling like chrome devtools, etc...
Whether or not that applies here is an open question since those are not really part of a url. I'm not too familiar with grafana or splunk or whatever but aren't there any display options for custom fields?
Previously discussed here graphql/graphql-spec#1185