Skip to content

GH-49473: [Python] Fix get_include and get_library_dirs to work with both editable and non-editable builds#49476

Open
raulcd wants to merge 1 commit intoapache:mainfrom
raulcd:GH-49473
Open

GH-49473: [Python] Fix get_include and get_library_dirs to work with both editable and non-editable builds#49476
raulcd wants to merge 1 commit intoapache:mainfrom
raulcd:GH-49473

Conversation

@raulcd
Copy link
Member

@raulcd raulcd commented Mar 9, 2026

Rationale for this change

With scikit-build-core our Cython tests fail to find PyArrow C++ headers due to get_include returning a non-existent directory.

What changes are included in this PR?

Editable builds location of libraries and headers are located at ``

Previous to this change:

$ python -c "import pyarrow as pa; print(pa.get_include())"
/home/raulcd/code/arrow/python/pyarrow/include
$ ls /home/raulcd/code/arrow/python/pyarrow/include
ls: cannot access '/home/raulcd/code/arrow/python/pyarrow/include': No such file or directory

with this change:

$ python -c "import pyarrow as pa; print(pa.get_include())"
/home/raulcd/code/pyarrow-dev/lib/python3.13/site-packages/pyarrow/include
$ ls /home/raulcd/code/pyarrow-dev/lib/python3.13/site-packages/pyarrow/include
arrow

Are these changes tested?

Yes, locally for editable installs and on CI for non-editable.

Are there any user-facing changes?

The only change is that headers, generated Cython cpp files and built shared libraries are installed on the virtualenv.

… with both editable and non-editable installs
@raulcd
Copy link
Member Author

raulcd commented Mar 9, 2026

@pitrou this solves the issue but I am wondering whether it is ok for those files to be where they are at the moment. Can you validate whether it works for you (editable on conda environment?)
I can see numpy does something "similar" but they don't use scikit-build-core they use meson-py now.
https://github.com/numpy/numpy/blob/fb8157f2d60d57ce460f5cf5d23f792f60affd3d/numpy/lib/_utils_impl.py#L78-L120

@pitrou
Copy link
Member

pitrou commented Mar 9, 2026

It "solves" the issue of test_cython but it breaks the request to make an editable install.
libarrow_python.so and include/arrow/python should be created in the editable install tree, not in the conda environment.

Right now it gives me a "hybrid" install, which is going to be error-prone:

>>> pa.__file__
'/home/antoine/arrow/dev/python/pyarrow/__init__.py'
>>> pa.lib.__file__
'/home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/pyarrow/lib.cpython-310-x86_64-linux-gnu.so'

(I actually don't understand how it works at all)

@pitrou
Copy link
Member

pitrou commented Mar 9, 2026

Looking at our build logs, I think what we're seeing is that scikit-build-core's interpretation of "editable install" is very unusual:

  *** Making editable...
  *** Created pyarrow-24.0.0.dev201+gdbbf7cf813-cp310-cp310-linux_x86_64.whl
  Building editable for pyarrow (pyproject.toml) ... done
  Created wheel for pyarrow: filename=pyarrow-24.0.0.dev201+gdbbf7cf813-cp310-cp310-linux_x86_64.whl size=30098932 sha256=0e64b9e184c670cdcecbc53c43bc216814576a667b1f40145384eff78de64e64
  Stored in directory: /tmp/pip-ephem-wheel-cache-_2s9ketp/wheels/f6/3f/0e/648fce17d996bb2435861ec788c2bcd92a3dcf6469364f0bcc
Successfully built pyarrow
Installing collected packages: pyarrow
  Attempting uninstall: pyarrow
    Found existing installation: pyarrow 24.0.0.dev201+g6e82bd4c64
    Uninstalling pyarrow-24.0.0.dev201+g6e82bd4c64:
      Created temporary directory: /tmp/pip-uninstall-y4w2nmtd
      Removing file or directory /home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/__pycache__/_pyarrow_editable.cpython-310.pyc
      Created temporary directory: /tmp/pip-uninstall-6eo67r61
      Removing file or directory /home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/_pyarrow_editable.pth
      Removing file or directory /home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/_pyarrow_editable.py
      Created temporary directory: /home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/~yarrow-24.0.0.dev201+g6e82bd4c64.dist-info
      Removing file or directory /home/antoine/mambaforge/envs/pyarrow/lib/python3.10/site-packages/pyarrow-24.0.0.dev201+g6e82bd4c64.dist-info/
      Successfully uninstalled pyarrow-24.0.0.dev201+g6e82bd4c64

Successfully installed pyarrow-24.0.0.dev201+gdbbf7cf813

@pitrou
Copy link
Member

pitrou commented Mar 9, 2026

So perhaps this PR is the right way to do it, given scikit-build-core's own strategy.

Also, scikit-build-core admittedly lists editable install support as "experimental".

@pitrou
Copy link
Member

pitrou commented Mar 9, 2026

Oh, actually it's configurable:
https://scikit-build-core.readthedocs.io/en/latest/reference/configs.html#editable

@henryiii explains it a bit more here.

Let me try with "inplace".

@pitrou
Copy link
Member

pitrou commented Mar 9, 2026

Ok, "inplace" does not seem to work at all as nothing is copied to the editable dir, even Python extension modules such as lib.cpython-310-x86_64-linux-gnu.so.

So we should probably stick with the default editable strategy for now.

@pitrou
Copy link
Member

pitrou commented Mar 9, 2026

Or, according to scikit-build/scikit-build-core#1141 (comment) 👍

If you want inplace to work, then you should move the CMake code to build to src/spiceypy/cyice/CMakeLists.txt. With an inplace build, you are not installing, so items remain where they are defined.

Well, I don't know what we should later do, but this PR is obviously simpler and solves the issue for now...

@raulcd raulcd marked this pull request as ready for review March 10, 2026 11:28
@raulcd raulcd requested review from AlenkaF and rok as code owners March 10, 2026 11:28
@raulcd
Copy link
Member Author

raulcd commented Mar 10, 2026

Or, according to scikit-build/scikit-build-core#1141 (comment) 👍

If you want inplace to work, then you should move the CMake code to build to src/spiceypy/cyice/CMakeLists.txt. With an inplace build, you are not installing, so items remain where they are defined.

I don't really understand the above, I am not sure I understand where our CMake code should live so the cpp/so files are not moved to the site-packages folder and remain in source.

@pitrou
Copy link
Member

pitrou commented Mar 10, 2026

Yes, I don't really understand it either. We should probably live with the current solution.

@rok
Copy link
Member

rok commented Mar 10, 2026

I consulted with 🤖 and perhaps we should switch to _lib.__file__ in create_library_symlinks and get_library_dirs too.
Here's also some proposed tests that monkeypatch pa.__file__ and pa.lib.__file__ to different directories, verifying that get_include(), get_library_dirs(), and the win32 pyarrow.libs path all resolve from the extension location.
See rok@34d8bf3

@raulcd
Copy link
Member Author

raulcd commented Mar 10, 2026

I consulted with 🤖 and perhaps we should switch to _lib.__file__ in create_library_symlinks and get_library_dirs too.

I am unsure about that, the problem happens for editable builds where scikit-build-core seems to be pushing the files to python site-packages instead of in the source tree. create_library_symlinks seems to be only necessary to be executed when building extensions and pyarrow has been installed from wheels (not editable is expected here). The files on that case should be at the current location.
https://arrow.apache.org/docs/python/integration/extending.html#building-extensions-against-pypi-wheels
We might have to revisit whether create_library_symlinks is still required to build extensions (might not be relevant anymore).

As per the other change on get_library_dirs, I also don't think it is required, those changes are for Windows delvewheel embedded msvcp140.dll and for bundled Arrow C++ those should be at the requested location. Editable builds don't apply for those cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants