15 changes: 15 additions & 0 deletions docs/drafts/types/git-documentation.md
@@ -0,0 +1,15 @@
# Git purl type
The git purl type is a primitive type intended as a base case for encoding information about a codebase that is under git version control.

## Definitions
namespace: The namespace for this purl type is defined as the URL path to the git host.
e.g. the `host` in git terminology.
https://git-scm.com/docs/git-clone.html#_git_urls

name: The name for this purl type is the path on the host to the codebase.
e.g. the `path-to-git-repo` in git terminology.
https://git-scm.com/docs/git-clone.html#_git_urls

version: The version for this type identifies a specific point in the lineage of the codebase. Generally this is a commit or tag, but it can be any valid git reference.
e.g. the `git reference` in git terminology.
https://git-scm.com/book/en/v2/Git-Internals-Git-References
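
As an illustration of the three definitions above, here is a minimal Python sketch that splits a `pkg:git` purl into its parts. The names `GitPurl` and `parse_git_purl` are hypothetical and not part of any purl library. The sketch assumes the first path segment after the type is the namespace (the host) and everything after it is the name, which is how the test cases in this change read. It ignores qualifiers and percent-decoding, so it is not a conforming parser.

```python
from typing import NamedTuple, Optional


class GitPurl(NamedTuple):
    """Hypothetical container for the parts of a pkg:git purl."""
    namespace: str            # the git host, e.g. "codeberg.org"
    name: str                 # the path on the host to the repository
    version: Optional[str]    # any git reference, ideally a commit or tag
    subpath: Optional[str]    # path inside the repository, if given


def parse_git_purl(purl: str) -> GitPurl:
    """Split a pkg:git purl (simplified: no qualifiers, no percent-decoding)."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError("not a purl: missing 'pkg:' scheme")
    rest, _, subpath = rest.partition("#")      # "#<subpath>" comes last
    rest, _, _qualifiers = rest.partition("?")  # qualifiers are discarded here
    path, _, version = rest.partition("@")      # "@<version>" follows the name
    # Empty segments (e.g. from a trailing "/") are dropped.
    segments = [s for s in path.split("/") if s]
    if len(segments) < 3 or segments[0] != "git":
        raise ValueError("expected pkg:git/<namespace>/<name>")
    return GitPurl(
        namespace=segments[1],
        name="/".join(segments[2:]),
        version=version or None,
        subpath=subpath or None,
    )
```

Applied to the codeberg.org input used in the test file, this yields the namespace `codeberg.org` and the name `forgejo/forgejo`.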
1 change: 1 addition & 0 deletions purl-types-index.json
@@ -16,6 +16,7 @@
"docker",
"gem",
"generic",
"git",
"github",
"golang",
"hackage",
137 changes: 137 additions & 0 deletions tests/types/git-test.json
@@ -0,0 +1,137 @@
{
"$schema": "https://packageurl.org/schemas/purl-test.schema-0.1.json",
"tests": [
{
"description": "git namespace and name should be lowercased. Roundtrip an input purl to canonical.",
"test_group": "advanced",
"test_type": "roundtrip",
"input": "pkg:git/github/Package-url/purl-Spec@244fd47e07d1004f0aed9c",
"expected_output": "pkg:git/github/package-url/purl-spec@244fd47e07d1004f0aed9c",
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "git namespace and name should be lowercased",
"test_group": "base",
"test_type": "parse",
"input": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_output": {
"type": "git",
"namespace": "codeberg.org",
"name": "forgejo/forgejo",
"version": "a72d2c07cfca03b55371089de6aa230d8c951fa0",
"qualifiers": null,
"subpath": null
},
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "git namespace and name should be lowercased. Roundtrip a canonical input to canonical output.",
"test_group": "base",
"test_type": "roundtrip",
"input": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_output": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "git namespace and name should be lowercased",
"test_group": "base",
"test_type": "build",
"input": {
"type": "git",
"namespace": "codeberg.org",
"name": "forgejo/forgejo",
"version": "a72d2c07cfca03b55371089de6aa230d8c951fa0",
"qualifiers": null,
"subpath": null
},
"expected_output": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "Parse test for PURL type: git",
"test_group": "base",
"test_type": "parse",
"input": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_output": {
"type": "git",
"namespace": "codeberg.org",
"name": "forgejo/forgejo",
"version": "a72d2c07cfca03b55371089de6aa230d8c951fa0",
"qualifiers": null,
"subpath": null
},
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "Roundtrip test for PURL type: git",
"test_group": "base",
"test_type": "roundtrip",
"input": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_output": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "Build test for PURL type: git",
"test_group": "base",
"test_type": "build",
"input": {
"type": "git",
"namespace": "codeberg.org",
"name": "forgejo/forgejo",
"version": "a72d2c07cfca03b55371089de6aa230d8c951fa0",
"qualifiers": null,
"subpath": null
},
"expected_output": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0",
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "Parse test for PURL type: git",
"test_group": "base",
"test_type": "parse",
"input": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0#options/locale_readme.md",
"expected_output": {
"type": "git",
"namespace": "codeberg.org",
"name": "forgejo/forgejo",
"version": "a72d2c07cfca03b55371089de6aa230d8c951fa0",
"qualifiers": null,
"subpath": "options/locale_readme.md"
},
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "Roundtrip test for PURL type: git",
"test_group": "base",
"test_type": "roundtrip",
"input": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0#options/locale_readme.md",
"expected_output": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0#options/locale_readme.md",
"expected_failure": false,
"expected_failure_reason": null
},
{
"description": "Build test for PURL type: git",
"test_group": "base",
"test_type": "build",
"input": {
"type": "git",
"namespace": "codeberg.org",
"name": "forgejo/forgejo",
"version": "a72d2c07cfca03b55371089de6aa230d8c951fa0",
"qualifiers": null,
"subpath": "options/locale_readme.md"
},
"expected_output": "pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0#options/locale_readme.md",
"expected_failure": false,
"expected_failure_reason": null
}
]
}
46 changes: 46 additions & 0 deletions types-doc/git-definition.md
@@ -0,0 +1,46 @@
<!-- NOTE: Auto-generated from the JSON PURL type definition.
Do not manually edit this file. Edit the JSON type definition instead. -->

# PURL Type Definition: git

- **Type Name:** Git
- **Description:** Git-based source packages
- **Schema ID:** `https://packageurl.org/types/git-definition.json`

## PURL Syntax

The structure of a PURL for this package type is:

pkg:git/<namespace>/<name>@<version>?<qualifiers>#<subpath>

## Repository Information

- **Use Repository:** Yes
- **Note:** There is no default package repository, this should be implied from namespace.

## Namespace definition

- **Requirement:** Required
- **Case Sensitive:** Yes
- **Native Label:** The url path to the git host
- **Note:** `The source host for the git repository. See: https://git-scm.com/docs/git-clone.html#_git_urls`

## Name definition

- **Requirement:** Required
- **Case Sensitive:** Yes
- **Native Label:** repository name with owner
- **Note:** `The path on the host to the git repository. See: https://git-scm.com/docs/git-clone.html#_git_urls`

## Version definition

- **Requirement:** Optional
- **Native Label:** A git reference
- **Note:** `The version is a git reference (https://git-scm.com/book/en/v2/Git-Internals-Git-References). Ideally a commit or tag.`

## Examples

- `pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0#options/locale_readme.md`
- `pkg:git/cygwin.com/cgit/newlib-cygwin@6d049c54c3314da31d9ffac133a6a2f2dfecaac2`
- `pkg:git/projects.blender.org/blender/blender.git`
- `pkg:git/gitlab.gnome.org/GNOME/adwaita-fonts`
34 changes: 34 additions & 0 deletions types/git-definition.json
@@ -0,0 +1,34 @@
{
"$schema": "https://packageurl.org/schemas/purl-type-definition.schema-1.0.json",
"$id": "https://packageurl.org/types/git-definition.json",
"type": "git",
"type_name": "Git",
"description": "Git-based source packages",

Nit: would "source repositories" be clearer / more accurate than "source packages"?

Contributor Author

It could be. I was trying to stay with the package verbiage to align with the rest of the package url types, but I'm not opposed to swapping. @pombredanne do you have a preference one way or another?

"repository": {
"use_repository": true,
"note": "There is no default package repository, this should be implied from namespace."
},
"namespace_definition": {
"requirement": "required",
"case_sensitive": true,
"native_name": "The url path to the git host",
"note": "The source host for the git repository. See: https://git-scm.com/docs/git-clone.html#_git_urls"
},
"name_definition": {
"requirement": "required",
"case_sensitive": true,
"native_name": "repository name with owner",
"note": "The path on the host to the git repository. See: https://git-scm.com/docs/git-clone.html#_git_urls"
},
"version_definition": {
"requirement": "optional",
"native_name": "A git reference",
"note": "The version is a git reference (https://git-scm.com/book/en/v2/Git-Internals-Git-References). Ideally a commit or tag."
},
"examples": [
"pkg:git/codeberg.org/forgejo/forgejo/@a72d2c07cfca03b55371089de6aa230d8c951fa0#options/locale_readme.md",
Member

Isn't this supposed to be exactly the (canonical) Git clone URL? If so, why is the .git suffix being omitted in this example?

Similarly, shouldn't a trailing / be omitted?

Contributor Author

Interestingly enough I can't find a primary source reference talking about the .git suffix. The best I was able to find while searching is this Stack Overflow question, where there is an assertion (without reference) that a trailing .git is a naming convention:
https://stackoverflow.com/questions/8686691/what-does-the-git-mean-in-a-git-url
The primary git reference shows the suffix in examples but does not seem to explain it directly:
https://git-scm.com/docs/git-clone.html#_git_urls

Testing locally (including with the trailing /), I see some differences, but they all come down to which URL is recorded in the local .git directory.
For example, when cloning this repo in the three different ways, I see:

jon~/g/randos❯❯❯ diff test-1/.git/config test-2/.git/config
9c9
< 	url = https://github.com/package-url/purl-spec
---
> 	url = https://github.com/package-url/purl-spec/
...
jon~/g/randos❯❯❯ diff test-1/.git/config test-3/.git/config
9c9
< 	url = https://github.com/package-url/purl-spec
---
> 	url = https://github.com/package-url/purl-spec.git

For the purposes of identifying the actual code they all seem equivalent. I suppose there's a choice for us: either deviate from what is allowed in the world of git and git tools and force a normalization, or align with the git world for ease of interoperability. Based on the conversation in #780 I believe the preference is to align with the norms of the git world.
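
To make the comparison above concrete, here is a small Python sketch of the three spellings that were cloned (bare, trailing `/`, and `.git`). The helper names `clone_url_variants` and `may_be_same_repo` are hypothetical. The check is only a heuristic: whether the `.git` suffix is optional depends on the hosting platform, so a match here does not prove two URLs point at the same repository.

```python
from typing import Set


def clone_url_variants(url: str) -> Set[str]:
    """Return the three spellings compared above: bare, trailing "/", and ".git"."""
    base = url.rstrip("/")
    if base.endswith(".git"):
        base = base[: -len(".git")]
    return {base, base + "/", base + ".git"}


def may_be_same_repo(a: str, b: str) -> bool:
    """Heuristic only: True when two URLs differ at most by a trailing "/" or ".git"."""
    return bool(clone_url_variants(a) & clone_url_variants(b))
```

A tool could use such a check to flag likely duplicates for a human to confirm, without rewriting the purl itself, which stays in line with not forcing a normalization.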

That said, looking into the git URL doc did make me think that the namespace and name definitions should be altered slightly to align with how git URLs are defined upstream.
See: 88e820a

@andrew you don't have any insight into .git and trailing /s do you?

Member

> Interestingly enough I can't find a primary source reference talking about the .git suffix.

I believe its (conventional) use is simply hosting-platform-specific. I'm simply looking at the platform's "copy URL to clipboard" UI to see if that platform prefers to have a ".git" suffix in the URL or not.

So I agree that we can't have a general rule here, but for the scope of the examples, which name concrete platforms, IMO we should adhere to the preference of that platform.

Contributor

I believe the .git suffix comes from the original git-http-backend CGI script, which served repositories over HTTP by mapping the URL path directly to the filesystem.

As far as I know, all the major forges support cloning both with and without .git.

Contributor

It is commonly removed by forges, not Git itself, meaning it's not possible to tell for any given Git URL whether the .git is optional or not.

Forgejo: https://codeberg.org/forgejo/forgejo/src/commit/df79ccf7d8b69f63b7cb66d340e26ce1e3e79c89/routers/web/repo/githttp.go#L62
GitLab: https://gitlab.com/gitlab-org/gitlab/-/blob/358f46317b3613756ee6c471379d4cbdc33d9197/config/routes/git_http.rb#L58-68

Member

> @sschuberth would you like to propose an example to add/edit?

Well, any Git clone URL of a GitHub project would do, I guess, so for example:

pkg:git/github.com/oss-review-toolkit/ort.git@86c6b09b93db996689735d1eaabaa86a1051a319#cli/build.gradle.kts

Member

@sschuberth sschuberth Mar 10, 2026

However, if we give such an example, we should probably make clear that this is not the "canonical form" of a GitHub PURL, and people should prefer pkg:github/... instead.

Contributor

pkg:git/projects.blender.org/blender/blender.git and pkg:git/projects.blender.org/blender/blender are equivalent, but PURL implementations cannot be expected to know.

Contributor Author

@sschuberth how do you feel about the example (stolen from @matt-phylum) added in 8b1d9f0 ?

As for the advice about preferring the github type over the more primitive git type; totally agree. That's probably something to add to the human readable doc that @mjherzog mentioned. Maybe that's a follow up PR to add that though? I'm not sure there's a paved path for that doc yet (please correct me if I'm wrong).

Member

> @sschuberth how do you feel about the example (stolen from @matt-phylum) added in 8b1d9f0 ?

Fine with me.

> As for the advice about preferring the github type over the more primitive git type; totally agree. [...] Maybe that's a follow up PR to add that though?

Fine with me as well to do that as a follow-up; we just should not forget about giving that advice.


This example uses the #, followed by a path to a file in the repository, but it doesn't look like this behavior is specified in this document.

Also, we likely ought to specify / explain in whatever documentation is made for this type that the path part must be stripped if you're going to git clone a repository.
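
A minimal Python sketch of that stripping step follows. The function name `clone_target` is hypothetical, and the `https` scheme is an assumption, since the purl itself does not record a transport. The path is kept verbatim (including any trailing `/` or `.git`) rather than normalized.

```python
from typing import Optional, Tuple


def clone_target(purl: str, scheme: str = "https") -> Tuple[str, Optional[str]]:
    """Return (clone_url, version) for a pkg:git purl, discarding the subpath."""
    prefix = "pkg:git/"
    if not purl.startswith(prefix):
        raise ValueError("expected a pkg:git purl")
    rest = purl[len(prefix):]
    rest = rest.split("#", 1)[0]  # drop "#<subpath>": it is not part of the clone URL
    rest = rest.split("?", 1)[0]  # drop "?<qualifiers>" (ignored in this sketch)
    path, _, version = rest.partition("@")
    return f"{scheme}://{path}", (version or None)
```

The returned URL can be passed to `git clone`, and the version, when present, to `git checkout` afterwards. The subpath only becomes meaningful once the working tree exists.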

Contributor Author

I'm not opposed to expanding on the # path behavior, but it is also shared with the github and bitbucket types, and I think it comes from the file_name qualifier here
https://github.com/package-url/purl-spec/blob/main/docs/common-qualifiers.md
I'll be honest, I built from the example of the github type which has that same behavior here
https://github.com/package-url/purl-spec/blob/main/types/github-definition.json#L30

The # character is not explicitly mentioned in the common qualifiers doc, so perhaps this is a spec level clarification rather than just for this type?


Hm, good references. I suppose this goes to the purl folks for where information should be communicated. I'm in favor of being as explicit as possible (even if that means that the specs for other types need to change to be more detailed).

Member

@alilleybrinker In any case we need to start with documenting details outside of the <purl-type>.definition.json files that are based on the Schema included in ECMA-427 (Clause 6). Making schema level changes based on the most complex cases is problematic and we need many good examples before proposing schema changes.

There are 3 other places to put this information:

  • Test cases for a PURL type
  • PURL type background documentation
  • Specific PURL type information in the How to documentation

We have flexibility for adding new documentation and we now have an efficient process for publishing documentation at www.packageurl.org.

Contributor Author

Alright, with a little help I found the reference @alilleybrinker. I was looking for a file specifier and the term I should have been looking for was subpath. So, I believe we're good on this being documented as normal construction
https://packageurl.org/docs/purl/specification#subpath

"pkg:git/cygwin.com/cgit/newlib-cygwin@6d049c54c3314da31d9ffac133a6a2f2dfecaac2",
"pkg:git/projects.blender.org/blender/blender.git",
"pkg:git/gitlab.gnome.org/GNOME/adwaita-fonts"
]
}