Project overview

Title

Enable Building of gRPC Python with Bazel

Overview

gRPC Python is currently built by a constellation of scripts, which have significant limitations in speed and maintainability. Bazel is the open-source variant of Google's internal build system, Blaze, and is well suited to building such projects in a fast and declarative fashion. Bazel itself is still in active development, however, especially its support for Python (among a few other languages).

The project aimed to fill this gap and build gRPC Python with Bazel.

Project page

Link to proposal

Thoughts and challenges

State of Bazel for Python

Contrary to earlier speculation, the project did not require any contributions directly to bazelbuild/bazel. The Bazel rules for Python are currently being separated out into their own repo at bazelbuild/rules_python.

Bazel is still very much in active development for Python, though. There are still challenges when it comes to building for Python 2 vs Python 3, and using pip packages is still experimental. Bazel's Python support is currently spread across these two repositories and has yet to begin migrating to one place (which will be bazelbuild/rules_python).

Bazel's roadmap for Python is publicly available here as a Google doc.

Cross collaboration between projects

Cross-project collaboration came up, somewhat surprisingly, because of the need to build protobuf sources for Python, which Bazel still does not support natively. An existing repository, pubref/rules_protobuf, maintained independently (i.e. not as part of Bazel), helped solve this problem, but it had one major blocking issue that could not be resolved upstream. User dududko proposed a fix that was not merged because of failing golang tests, but it worked well for Python. Hence, a fork of this repo was made and is to be used with gRPC until the fix can be merged back upstream.

Building Cython code

Building Cython code is still not supported by Bazel itself, but the team at cython/cython has added support for Bazel on their side. It works by including Cython as a third-party Bazel dependency and using custom Bazel rules that build our Cython code with the binary shipped in that dependency.

Packaging Python code using Bazel

pip and PyPI remain the de facto standard for distributing Python packages. Although Bazel is versatile and its reproducible, incremental builds are a great fit, for now these benefit only contributors and developers who build and test the gRPC code. There is not yet a way to build Python packages for distribution with Bazel.

Building gRPC Python with Bazel on Kokoro (internal CI)

Integration with the internal CI was one of the areas that highlighted how simple Bazel can be to use. gRPC was already using a dockerized Bazel setup to build some of its core code (but not as the primary build setup). Adding a new job on the internal CI ended up being as simple as creating a new shell script to install the required dependencies (python-dev and Bazel) and a new configuration file pointing to the subdirectory (src/python) under which to look for targets and run the tests.

Handling imports in Python code

When writing Python packages, imports in nested modules are typically made relative to the package root. Because of the way Bazel works, however, these paths would not make sense from the workspace root. To handle this, the folks at Bazel have added a nifty imports parameter to all the Python rules, which lets us specify, for each target, which path to treat as the root. This parameter accepts relative paths such as imports = ["../",].
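
To make the path arithmetic concrete, here is a minimal Python sketch of how an imports entry maps to an import root, assuming each entry is interpreted relative to the directory of the BUILD file that declares the target. The function name resolve_import_roots and the package path in the example are hypothetical and chosen for illustration; this is not Bazel's own implementation.

```python
import posixpath


def resolve_import_roots(package, imports):
    """Return the workspace-relative directories that `imports` entries point at.

    `package` is the workspace-relative directory of the BUILD file. Each entry
    in `imports` is assumed to be relative to that directory.
    """
    roots = []
    for entry in imports:
        if posixpath.isabs(entry):
            raise ValueError("absolute import paths are not allowed: %r" % entry)
        root = posixpath.normpath(posixpath.join(package, entry))
        # A normalized path that still starts with ".." points above the workspace.
        if root == ".." or root.startswith("../"):
            raise ValueError("import path escapes the workspace: %r" % entry)
        roots.append(root)
    return roots


if __name__ == "__main__":
    # Hypothetical package directory, used only to show the effect of "../".
    print(resolve_import_roots("src/python/grpcio_tests/tests", ["../"]))
```

With a root resolved this way, a module in the package can keep importing relative to the package root rather than relative to the workspace root.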

Fetching Python headers for Cython code to use

Cython code makes use of Python.h, which pulls in the Python API for C extension modules to use, but its location depends on the Python version and the operating system the code is being built on. To make this easier, the folks at TensorFlow wrote repository rules for Python autoconfiguration. These have been adapted, with some modifications, for use in gRPC Python as well.
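
As a rough illustration of why autoconfiguration is needed, the following Python sketch asks the running interpreter where it expects Python.h to live and checks whether the header is actually there. It uses only the standard-library sysconfig module. The helper names are hypothetical, and this is not the repository rule itself, only the kind of lookup such a rule can delegate to the interpreter.

```python
import os
import sysconfig


def python_include_dir():
    """Return the directory where this interpreter expects Python.h to live."""
    return sysconfig.get_paths()["include"]


def has_python_header(include_dir):
    """Report whether Python.h exists there (it may be absent without python-dev)."""
    return os.path.isfile(os.path.join(include_dir, "Python.h"))


if __name__ == "__main__":
    include_dir = python_include_dir()
    # The printed directory differs between Python versions and platforms.
    print(include_dir, has_python_header(include_dir))
```

Running this under different interpreters shows different include directories, which is the variation a repository rule has to resolve before the Cython targets can compile.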

How to use

All the Bazel tests for gRPC Python can be run using a single command:

bazel test --spawn_strategy=standalone --genrule_strategy=standalone //src/python/...

To run a specific test, say LoggingPoolTest (which lives in src/python/grpcio_tests/tests/unit/framework/foundation/_logging_pool_test.py), the command is:

bazel test --spawn_strategy=standalone --genrule_strategy=standalone //src/python/grpcio_tests/tests/unit/framework/foundation:logging_pool_test

where logging_pool_test is the name of the Bazel target for this test.

Similarly, to run a particular test method, use:

bazel test --spawn_strategy=standalone --genrule_strategy=standalone //src/python/grpcio_tests/tests/unit:_rpc_test --test_arg=RPCTest.testUnrecognizedMethod

Useful Bazel flags

  • Use bazel build with the -s flag (short for --subcommands) to print the commands Bazel executes to standard output while building.
  • Similarly, use bazel test with --test_output=streamed to see the test logs while testing. Note that with this flag all tests are run locally, without sharding, one at a time.
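
The invocations in this section all follow the same shape, so they can be assembled by a small helper. The Python sketch below only builds the argument list and does not run Bazel. The helper name bazel_test_command and its parameters are hypothetical; the flags and labels are the ones shown above, and labels are written in Bazel's //package:target form.

```python
import shlex

# Flags used by every invocation shown in this document.
BASE_FLAGS = ["--spawn_strategy=standalone", "--genrule_strategy=standalone"]


def bazel_test_command(package, target=None, test_arg=None, stream_output=False):
    """Build the argv for a `bazel test` run.

    With no `target`, every test under `package` is selected via `/...`.
    """
    if target is None:
        label = "//%s/..." % package
    else:
        label = "//%s:%s" % (package, target)
    argv = ["bazel", "test"] + BASE_FLAGS
    if stream_output:
        # Streams test logs; tests then run locally, one at a time.
        argv.append("--test_output=streamed")
    argv.append(label)
    if test_arg is not None:
        argv.append("--test_arg=%s" % test_arg)
    return argv


if __name__ == "__main__":
    argv = bazel_test_command(
        "src/python/grpcio_tests/tests/unit",
        target="_rpc_test",
        test_arg="RPCTest.testUnrecognizedMethod",
    )
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Returning a list rather than a single string keeps the result safe to pass to subprocess.run without shell quoting concerns.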

Contributions

  • 435c6f8 Update grpc_gevent cython files to include .pxi
  • 74426fd Add gevent_util.h to grpc_base_c Bazel target
  • b6518af Upgrade Bazel to 0.15.0
  • ebcf04d Kokoro setup for building gRPC Python with Bazel
  • 3af1aaa Basic setup to build gRPC Python with Bazel
  • 11f199e Workspace changes to build gRPC Python with Bazel
  • 848fd9d Minimal Bazel BUILD files for grpcio Python

Other contributions

  • 89ce16b Update Dockerfiles for python artifacts to use latest git version
  • 32f7c48 Revert removals from python artifact dockerfiles
  • 712eb9f Make logging after success in jobset more apparent
  • c6e4372 Create README for gRPC Python reflection package
  • 2e113ca Update logging in Python to use module-level logger

Pending PRs

  • BUILD files for all tests in tests.json.
  • BUILD files for gRPC testing, gRPC health checking, gRPC reflection.
  • (Yet to complete) BUILD files for grpcio_tools. One test depends on this.

Known issues

  • grpc/grpc #16336 RuntimeError for _reconnect_test Python unit test with Bazel
  • Some tests pass under Bazel despite raising an exception. Example: testAbortedStreamStream in src/python/grpcio_tests/tests/unit/_metadata_code_details_test.py.
  • #14557 introduced a minor bug where the module-level loggers don't initialize a default logging handler.
  • The sanity test doesn't make sense in the context of Bazel, and thus fails.
  • There are some issues with Python 2 vs Python 3. Specifically,
    • On some machines, a “cygrpc.so: undefined symbol: _Py_FalseStruct” error shows up. This happens because an incorrect Python version is used to build Cython.
    • Some external packages like enum34 throw errors when used with Python 3, and the current build scripts install some extra packages depending on the Python version. For now, the extra packages are added to a requirements.bazel.txt file in the repository root.