Commit Graph

4756 Commits

Author SHA1 Message Date
Craig Tiller
a8b31fab81 Refactor CallTracer API for SendInitialMetadata.
This change introduces a new experiment `call_tracer_send_initial_metadata_is_an_annotation`. When enabled, the `CallTracer::RecordSendInitialMetadata` method will now record a `SendInitialMetadataAnnotation` and call a new `MutateSendInitialMetadata` method on the underlying `CallTracerInterface`.

The `CallTracerInterface` and its implementations (including XorMetrics, OpenCensus, OpenTelemetry, and test fakes) have been updated to include the new `MutateSendInitialMetadata` virtual method. The existing `RecordSendInitialMetadata` implementations are modified to check the experiment flag and delegate to `MutateSendInitialMetadata` if the experiment is active.

A new `SendInitialMetadataAnnotation` class is added, which inherits from `CallTracerAnnotationInterface::Annotation`. This annotation type is used to capture the state of the initial metadata for immutable tracing purposes.

Additionally, `ForEachKeyValue` methods are added to `MetadataInfo` and `HttpAnnotation` to facilitate iterating over metadata key-value pairs for annotation recording. The experiment configuration files are updated to include the new experiment.

PiperOrigin-RevId: 854227812
2026-01-09 09:47:03 -08:00
Alisha Nanda
d3afb84946 Rename HttpConnectHandshaker to HttpConnectClientHandshaker in preparation for adding a HttpConnectServerHandshaker.
PiperOrigin-RevId: 852382782
2026-01-05 11:24:22 -08:00
Mark D. Roth
dabda5fea8 [xDS] update version of xDS protos (#41242)
This temporarily disables the bzlmod version consistency check, because the new version of the xDS protos winds up pulling in a lot of upgraded dependencies that will take some work to get working.

Closes #41242

PiperOrigin-RevId: 852345420
2026-01-05 10:03:24 -08:00
Tanvi Jagtap - Google LLC
d7509f9913 [PH2][BUILD][Trivial] Build files (#41301)
[PH2][BUILD][Trivial] Build files

Closes #41301

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/41301 from tanvi-jagtap:ph2_more_files 8b396d0f4b5cbb7559eebc4612cecb4ce0506361
PiperOrigin-RevId: 848476035
2025-12-24 01:34:53 -08:00
Michael Lumish
522dbbbb25 [Release] Bump version to 1.79.0-dev (on master branch) (#41291)
Change was **not** created by the release automation script, because it doesn't handle a +2 version bump. See go/grpc-release

Closes #41291

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/41291 from murgatroid99:v1.79.0-dev_bump 9a9bf54e5a891459390792dc9d547bdc17b7dd4d
PiperOrigin-RevId: 848168598
2025-12-23 07:26:31 -08:00
Mark D. Roth
0a6901dcd9 [python xDS protos] move to a shallower directory (#41261)
This will avoid exceeding the Windows 150-character path length limit when we upgrade the xDS protos.

Closes #41261

PiperOrigin-RevId: 847800132
2025-12-22 10:00:34 -08:00
Tanvi Jagtap - Google LLC
560e95a3f4 [PH2][Trivial][BUILD] Adding new files (#41286)
[PH2][Trivial][BUILD] Adding new files

Closes #41286

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/41286 from tanvi-jagtap:new_files_01 81fcc2e4244b07758454e89c70ec666254e56e80
PiperOrigin-RevId: 847787819
2025-12-22 09:14:16 -08:00
Kai-Hsun Chen
ddbfe03ab7 [python] aio: fix race condition causing asyncio.run() to hang forever during the shutdown process (#40989)
# Root cause
* gRPC AIO creates a Unix domain socket pair, and the current thread passes the read socket to the event loop for reading, while the write socket is passed to a thread for polling events and writing a byte into the socket.
* However, during the shutdown process, the event loop stops reading the read socket without closing it before the polling thread receives the final event to exit the thread.
* The shutdown process will hang if (1) the event loop stops reading the read socket before the polling thread receives the final event to exit the thread, and (2) the polling thread is stuck in the `write` syscall.
  * The `write` syscall may get stuck at [sock_alloc_send_pskb](https://elixir.bootlin.com/linux/v5.15/source/net/core/sock.c#L2463) when there is not enough socket buffer space for the write socket. Hence, the polling thread hangs at `write` and cannot continue to the next iteration to retrieve the final event. Because the event loop no longer reads the read socket, the available buffer space for the write socket never increases. Therefore, the current thread hangs when waiting for the polling thread to `join()`.
* `asyncio` shuts down the default executor (`ThreadPoolExecutor`) when `asyncio.run(...)` finishes. Hence, it hangs because some threads can't join.
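
The buffer behaviour described above can be observed without gRPC. The following is a minimal sketch, assuming a plain `socket.socketpair()` stands in for the socket pair that gRPC AIO creates; `fill_until_blocked` is a hypothetical helper, and the writer is made non-blocking so the demo reports the full buffer instead of hanging the way a blocking `write` would.

```python
import socket


def fill_until_blocked(chunk_size=4096, max_chunks=100_000):
    """Write into one end of a socket pair while nobody reads the other end."""
    reader, writer = socket.socketpair()
    # Non-blocking mode turns "would hang in write()" into BlockingIOError.
    writer.setblocking(False)
    reader.setblocking(False)
    written = 0
    drained = 0
    blocked = False
    resumed = False
    try:
        for _ in range(max_chunks):
            try:
                written += writer.send(b"1" * chunk_size)
            except BlockingIOError:
                # No socket buffer space left: a blocking writer would be
                # stuck here until the peer reads.
                blocked = True
                break
        # Reading from the peer is what frees buffer space again.
        try:
            while True:
                data = reader.recv(65536)
                if not data:
                    break
                drained += len(data)
        except BlockingIOError:
            pass  # Nothing left to read.
        try:
            writer.send(b"1")
            resumed = True
        except BlockingIOError:
            resumed = False
    finally:
        reader.close()
        writer.close()
    return written, drained, blocked, resumed
```

If the reader is never drained, as happens once the event loop stops reading, the writer has no way to make progress, which is the hang this commit addresses.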

# Reproduction

* Step 0: Reduce the socket buffer size to increase the probability to reproduce the issue.
   ```sh
   sysctl -w net.core.rmem_default=8192
   sysctl -w net.core.rmem_default=8192
   ```
* Step 1: Manually update `unistd.write(fd, b'1', 1)` to `unistd.write(fd, b'1' * 4096, 4096)`. The goal is to make writes (4096 bytes per write) outpace reads (1 byte per read), so that the write buffer fills up almost completely.
8e67cb088d/src/python/grpcio/grpc/_cython/_cygrpc/aio/completion_queue.pyx.pxi (L31)

* Step 2: Create an `aio.insecure_channel` and use it to send 100 requests with at most 10 in-flight requests. After all requests finish, the shutdown process will be triggered, and it's highly likely to hang if you follow Steps 0 and 1 correctly. In my case, my reproduction script reproduces the issue 10 out of 10 times.

* Step 3: If it hangs, check the following information:
  * `ss -xpnm state connected | grep $PID` => You will find there are two sockets that belong to the same socket pair, and one has non-zero bytes in the read buffer while the other has non-zero bytes in the write buffer. In addition, write buffer should be close to `net.core.rmem_default`.
  * Check the stack of the `_poller_thread` by running `cat /proc/$PID/task/$TID/stack`. The thread is stuck at `sock_alloc_send_pskb` because there is not enough buffer space to finish the `write` syscall.
  * Use GDB to find the `_poller_thread` and make sure it's stuck at `write()`, then print its `$rdi` to confirm that the FD is the one with a non-zero write buffer in the socket.
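
The request pattern in Step 2 (100 requests, at most 10 in flight) can be sketched with an `asyncio.Semaphore`. This is only an illustration of the concurrency limit, not the author's reproduction script: `fake_request` is a hypothetical stand-in for a real unary call made on an `aio.insecure_channel`, and `send_with_limit` is an assumed helper name.

```python
import asyncio


async def send_with_limit(send_request, total=100, max_in_flight=10):
    """Run `total` requests while keeping at most `max_in_flight` active."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def one(index):
        # The semaphore blocks the 11th request until an earlier one finishes.
        async with semaphore:
            return await send_request(index)

    # gather() preserves the order of the inputs in its result list.
    return await asyncio.gather(*(one(i) for i in range(total)))


async def fake_request(index):
    """Stand-in for an RPC; yields to the event loop once and echoes the index."""
    await asyncio.sleep(0)
    return index
```

With a real channel, the shutdown path under investigation is triggered when `asyncio.run(...)` returns after all requests have completed.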

# Test

Follow Steps 0, 1, and 2 in the 'Reproduction' section with this PR. It doesn't hang in 10 out of 10 cases.

Closes #40989

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40989 from kevin85421:asyncio-hang ff74508a2c29e7c71dfe88365d1178f901d69787
PiperOrigin-RevId: 846425459
2025-12-18 14:58:03 -08:00
Rishesh Agarwal
16c24a2004 Adding layering_check and parse_headers in each bazel src/python build file
PiperOrigin-RevId: 840512315
2025-12-04 20:08:16 -08:00
siddharth nohria
c4b87395ce Server Wide Max Outstanding Streams: Add Build changes (#41076)
Allow servers to set max outstanding streams limit per server. This pull request only adds the BUILD changes required for this. The core logic will follow in a later PR.

Closes #41076

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/41076 from siddharthnohria:max_outstanding_streams 392d962fc78be66c075952977bc3a28f2298b7ce
PiperOrigin-RevId: 833196338
2025-11-17 00:16:50 -08:00
Mark D. Roth
ccd635ea24 [client channel] refactor v1 call buffering code (#40945)
This refactors the call buffering code for the v1 stack, which avoids some repetition between the resolver queue and the LB pick queue.  This code will also be used in the future in the subchannel as part of implementing the MAX_CONCURRENT_STREAMS design.

As part of this, I also eliminated the subclassing in the v1 client channel implementation, which has not been necessary since the v2 code was removed.

Closes #40945

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40945 from markdroth:call_buffer_v1_refactoring 0a471be6ed862c3cc3225644fb2a3e1456e60fbf
PiperOrigin-RevId: 829566551
2025-11-07 13:57:35 -08:00
Sreenithi Sridharan
d8698ff717 [Python] Migrate to pyproject.toml build system (#40833)
Fixes #40744.

Closes #40833

PiperOrigin-RevId: 826625632
2025-10-31 14:19:57 -07:00
Chandra Shekhar Sirimala
62ce3fc3d6 [Python] Log error details when ExecuteBatchError occurs (at DEBUG level) (#40921)
Log error details when `ExecuteBatchError` occurs. The error is logged with DEBUG severity level. Message example:

> `Failed to receive any message from Core: Failed grpc_call_start_batch: 8 with grpc_call_error value: 'GRPC_CALL_ERROR_TOO_MANY_OPERATIONS`

Closes #40921

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40921 from chandra-siri:log_exception_details 316de3dae2f1935051e4358d793ee14f727505fd
PiperOrigin-RevId: 823347363
2025-10-23 22:20:39 -07:00
Sreenithi Sridharan
bca7762d1c [Python] Update setuptools min version to 77.0.1 (#40931)
This PR updates the minimum version of the `setuptools` package required across different Python setup files to v77.0.1. This version adds Python 3.14 support and deprecates a format for defining the project license in `pyproject.toml` files ([Reference](https://setuptools.pypa.io/en/stable/history.html#id71)), which is a prerequisite for #40833.

Closes #40931

PiperOrigin-RevId: 823008815
2025-10-23 06:15:35 -07:00
Craig Tiller
88ed3efe6e Refactor CollectionScope to support multiple parents.
This change modifies `grpc_core::CollectionScope` to accept a vector of parent scopes instead of a single parent. This allows a single scope to aggregate its metrics into multiple parent scopes. Consequently, the special `GetGlobalCollectionScope` function and its associated global state are removed, as a root scope can now be created using `CreateCollectionScope({}, {})`.

PiperOrigin-RevId: 822316366
2025-10-21 16:15:36 -07:00
Luwei Ge
4cb3850cec [security][Python] migrate ssl_channel_credentials to use advanced tls API (#40878)
Switch to use the new [TLS Credentials API](https://github.com/grpc/proposal/pull/422) in Python's SSL channel credentials API.

Closes #40878

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40878 from rockspore:tls-py 148f5950522caa74178f62ed86d69086133e51c6
PiperOrigin-RevId: 819820924
2025-10-15 10:43:33 -07:00
Sreenithi Sridharan
3f19be492d [Fix][Python] Add parameters for CreateCollectionScope() in gRPC Observability (#40909)
https://github.com/grpc/grpc/pull/40851 introduced a change with mandatory parameters for `CreateCollectionScope()`.
While gRPC Python observability doesn't use it for functionality, it is invoked in `observability_util.cc` to force linking the instrument package and avoid build errors.

The change in #40851 has started causing Python tests to fail with the following error:
```
grpc_observability/observability_util.cc:98:36: error: too few arguments to function ‘grpc_core::RefCountedPtr grpc_core::CreateCollectionScope(grpc_core::RefCountedPtr, absl::lts_20250512::Span >, size_t, size_t)’
   98 |   grpc_core::CreateCollectionScope();  // Forces linking of instrument library
      |                                    ^
In file included from grpc_root/src/core/lib/resource_quota/telemetry.h:18,
                 from grpc_root/src/core/lib/resource_quota/memory_quota.h:44,
                 from grpc_root/src/core/lib/resource_quota/arena.h:39,
                 from grpc_root/src/core/lib/promise/arena_promise.h:29,
                 from grpc_root/src/core/lib/channel/channel_stack.h:58,
                 from grpc_observability/python_observability_context.h:33,
                 from grpc_observability/observability_util.h:31,
                 from grpc_observability/observability_util.cc:15:
```

This PR hence adds empty parameters to fix the breakage.

Closes #40909

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40909 from sreenithi:fix_observability_40851_breakage 4b9b7b23cfde3c97b5eef7cebeb5f81ce29d2402
PiperOrigin-RevId: 819690667
2025-10-15 04:44:28 -07:00
Craig Tiller
339906443b [clang-format] Match include file ordering to internal clang-format (#40905)
gRPC is currently getting formatted with two different clang-format implementations, and due to some weirdness they have different include file orderings. This change introduces clang-format configuration to ensure that the two systems align - it's *highly* expected that this will need some maintenance going forward as the two systems evolve.

Closes #40905

PiperOrigin-RevId: 819606209
2025-10-15 00:24:11 -07:00
Sergii Tkachenko
74538378b4 [Fix] make_grpcio_observability.py removeprefix fix (#40898)
Some CI jobs are still running it with Python 3.8.
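
For context, `str.removeprefix` only exists in Python 3.9 and later, so a script that must also run on 3.8 needs an equivalent. The sketch below shows one common fallback; `remove_prefix` is a hypothetical helper name, and the actual fix in the PR may be written differently.

```python
def remove_prefix(text, prefix):
    """Return `text` without a leading `prefix`; works on Python 3.8 too."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    # No match (or empty prefix): return the string unchanged.
    return text
```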

Closes #40898

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40898 from sergiitk:fix/ci/py-o11y ecd146a7454f490cac1ae9229c92ed4d059fd0d3
PiperOrigin-RevId: 818787222
2025-10-13 12:46:19 -07:00
Zgoda91
37a60d0392 [Python] Enable UPB code for o11y build (#40789)
This change includes:
1. Restore changes from https://github.com/grpc/grpc/pull/40652
2. Fix for UPB library dependencies for o11y module in Python

Closes #40789

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40789 from Zgoda91:python_o11y_lib_deps_fix 2bad9b6f61579c0fdaa3b405b727bf77d58e64ec
PiperOrigin-RevId: 817980787
2025-10-11 03:25:07 -07:00
Ashesh Vidyut
c7cf9cab81 [Python][Part 15] - Introducing Ruff (#40190)
### Description

## Part 15 of Introducing Ruff

* In this PR - the suppression for `TRY002` and `TRY004` has been removed on the root `ruff.toml`

## Related:
* Prev: #40189
* b/423755915

Closes #40190

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40190 from asheshvidyut:feature/setup-ruff-part-15 9e1c4c2712a052e5c5e38f415df8ec5e5d398a2e
PiperOrigin-RevId: 817159922
2025-10-09 06:37:33 -07:00
Ashesh Vidyut
5fa8962ac8 [Python][Part 14] - Introducing Ruff (#40189)
### Description

## Part 14 of Introducing Ruff

* In this PR - the suppression for `SIM103`, `SIM108`, `SIM114`, `SIM115`, `SIM117`, `SIM118`, `SIM300`, `T210`, `TC003`, `TD004` and `TD005` has been removed on the root `ruff.toml`

## Related:
* Next: #40190
* Prev: #40188
* b/423755915

Closes #40189

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40189 from asheshvidyut:feature/setup-ruff-part-14 39b25d291bed497322666129b7e96ec18846c35e
PiperOrigin-RevId: 816636222
2025-10-08 04:04:39 -07:00
Ashesh Vidyut
ed56044081 [Python][Part 13] - Introducing Ruff (#40188)
### Description

## Part 13 of Introducing Ruff

* In this PR - the suppression for `S311`, `S603` and `SIM102` has been removed on the root `ruff.toml`

## Related:
* Next: #40189
* Prev: #40187
* b/423755915

Closes #40188

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40188 from asheshvidyut:feature/setup-ruff-part-13 76f21dc6006b14ed6e611904868eb78a0c55d325
PiperOrigin-RevId: 816589294
2025-10-08 01:28:49 -07:00
ac-patel
1e315f3ff2 [PH2] Add goaway dependency to client transport (#40845)
Closes #40845

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40845 from ac-patel:goaway1 ab7c3165a249be4d00aec602a01b858ebdebe0c6
PiperOrigin-RevId: 816099297
2025-10-07 01:49:01 -07:00
Ashesh Vidyut
deb0be5648 [Python][Part 12] - Introducing Ruff (#40187)
### Description

## Part 12 of Introducing Ruff

* In this PR - the suppression for  `RUF013`, `RUF022`, `RUF023`, `S301` has been removed on the root `ruff.toml`

## Related:
* Next: #40188
* Prev: #40186
* b/423755915

Closes #40187

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40187 from asheshvidyut:feature/setup-ruff-part-12 c9120f9e979a7f60a487504b60cd8d91fbdf7123
PiperOrigin-RevId: 814551463
2025-10-02 23:20:48 -07:00
Ashesh Vidyut
e227774f94 [Python][Part 11] - Introducing Ruff (#40186)
### Description

## Part 11 of Introducing Ruff

* In this PR - the suppression for  `RET506` and `RET507` has been removed on the root `ruff.toml`

## Related:
* Next: #40187
* Prev: #40185
* b/423755915

Closes #40186

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40186 from asheshvidyut:feature/setup-ruff-part-11 16a0cb463a1fb11a0e940e6116e1daf8965cf7ce
PiperOrigin-RevId: 813056941
2025-09-29 20:28:20 -07:00
Sergii Tkachenko
02d82e4094 [Release] Bump version to 1.77.0-dev (on master branch) (#40796)
Change was created by the release automation script. See go/grpc-release.

Closes #40796

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40796 from sergiitk:bump_dev_version_202509291139 e7aa910253d1706a72822da986b8b8e7bc87931d
PiperOrigin-RevId: 812961524
2025-09-29 15:16:18 -07:00
Sergii Tkachenko
9438af796e [Python] Fork test: fix exit code output, configure better logging (#40740)
A minor fix.

Python `_fork_interop_test` was logging out `waitstatus` instead of the exit code.

> [`os.wait()`](https://docs.python.org/3.9/library/os.html#os.wait) \
> Wait for completion of a child process, and return a tuple containing its pid and exit status indication: a 16-bit number, whose low byte is the signal number that killed the process, and whose high byte is the exit status (if the signal number is zero); the high bit of the low byte is set if a core file was produced.
>
> [`waitstatus_to_exitcode()`](https://docs.python.org/3.9/library/os.html#os.waitstatus_to_exitcode) can be used to convert the exit status into an exit code.

This PR:
- Logs both the exit code and the wait status, clearly distinguishing the two.
- Configures the log format to be absl-like, but prefixes the thread id with the pid (which is more relevant in our case).
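
The difference between the two values can be sketched as follows, assuming a POSIX platform and Python 3.9 or later (where `os.waitstatus_to_exitcode` is available). `describe_wait_status` is a hypothetical helper for illustration, not the function used in `_fork_interop_test`.

```python
import os


def describe_wait_status(status):
    """Format a raw wait status together with the exit code derived from it."""
    # On POSIX the exit status lives in the high byte of the 16-bit status;
    # a process killed by signal N yields an exit code of -N.
    exit_code = os.waitstatus_to_exitcode(status)
    return f"exit_code={exit_code} wait_status={status}"
```

For a child that exits with code 3, the raw status returned by `os.wait()` is `3 << 8`, which is why logging the status alone is misleading.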

Closes #40740

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40740 from sergiitk:py/test/fork/exit-code 04951ba96c435f710bdfc8c3bd6605ff0fdf8f4e
PiperOrigin-RevId: 811856945
2025-09-26 10:06:50 -07:00
Craig Tiller
e5062bcce8 [resource-quota] Revert #39125 (#40768)
Closes #40768

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40768 from ctiller:rollback 37692d6a818e080c25063991fe867837b22ff0a6
PiperOrigin-RevId: 811517736
2025-09-25 15:10:38 -07:00
Ashesh Vidyut
f4ed0100c6 [Python][Part 10] - Introducing Ruff (#40185)
### Description

## Part 10 of Introducing Ruff

* In this PR - the suppression for  `RET505` has been removed on the root `ruff.toml`

## Related:
* Next: #40186
* Prev: #40184
* b/423755915

Closes #40185

PiperOrigin-RevId: 810694059
2025-09-23 20:58:50 -07:00
Ashesh Vidyut
6666cd9f28 [Python][Part 8] - Introducing Ruff (#40183)
### Description

## Part 8 of Introducing Ruff

* In this PR - the suppression for `PLR1714` and `PLR5501` has been removed on the root `ruff.toml`

## Related:
* Next: #40184
* Prev: #40182
* b/423755915

Closes #40183

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40183 from asheshvidyut:feature/setup-ruff-part-8 af5f5967d96c9e82d280bd66744bcf1ae972e7ec
PiperOrigin-RevId: 810411168
2025-09-23 06:09:52 -07:00
Ashesh Vidyut
d0f1de2ea2 [Python][Part 7] - Introducing Ruff (#40182)
### Description

## Part 7 of Introducing Ruff

* In this PR - the suppression for `N806`, `PERF102`, `PIE796`, `PERF401`, `PLC0206`, `PYI032`, `PYI045`, `PYI056`, `PLE0604`, `PLR0911`, `PLR0915`, `PLR1704` and `PLR1711` has been removed on the root `ruff.toml`

## Related:
* Next: #40183
* Prev: #40181
* b/423755915

Closes #40182

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40182 from asheshvidyut:feature/setup-ruff-part-7 ead903f59f22c9b207a4afd0d043a796464f1322
PiperOrigin-RevId: 810376538
2025-09-23 03:54:35 -07:00
Ashesh Vidyut
f895b6e694 [Python][Part 9] - Introducing Ruff (#40184)
### Description

## Part 9 of Introducing Ruff

* In this PR - the suppression for  `PLW0120`, `PLW0603`, `PLW1508`, `PLW1641`, `PT009`, and `PTH123` has been removed on the root `ruff.toml`

## Related:
* Next: #40185
* Prev: #40183
* b/423755915

Closes #40184

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40184 from asheshvidyut:feature/setup-ruff-part-9 ea6648bd71d0e9e5af7db3e06c6e3f85c6f5a35c
PiperOrigin-RevId: 810316723
2025-09-23 00:13:07 -07:00
Sergii Tkachenko
6019becfa8 [Python] Handle python3.14 get_event_loop behavior changes (#40750)
> [!IMPORTANT]
> **This fix is only needed to support the use of grpc.aio outside of a running event loop.** This approach is [strongly discouraged](https://docs.python.org/3.14/library/asyncio-policy.html#asyncio-policies) by Python, and will be deprecated in future gRPC releases.
>
> Please use the [asyncio.run()](https://docs.python.org/3.14/library/asyncio-runner.html#asyncio.run) function or the [asyncio.Runner](https://docs.python.org/3.14/library/asyncio-runner.html#asyncio.Runner). If you see this in Python REPL, use the dedicated [asyncio REPL](https://docs.python.org/3/library/asyncio.html#asyncio-cli) by running `python -m asyncio`.

This PR handles the following [asyncio behavioral changes](https://docs.python.org/3.14/whatsnew/3.14.html#id11) introduced in python3.14:
- [asyncio.get_event_loop()](https://docs.python.org/3.14/library/asyncio-eventloop.html#asyncio.get_event_loop) now raises a `RuntimeError` if there is no current event loop, and no longer implicitly creates an event loop.
- [Deprecations](https://docs.python.org/3.14/whatsnew/3.14.html#deprecated): [asyncio.get_event_loop_policy()](https://docs.python.org/3.14/library/asyncio-policy.html#asyncio.get_event_loop_policy), [asyncio.AbstractEventLoopPolicy](https://docs.python.org/3.14/library/asyncio-policy.html#asyncio.AbstractEventLoopPolicy), [asyncio.DefaultEventLoopPolicy](https://docs.python.org/3.14/library/asyncio-policy.html#asyncio.DefaultEventLoopPolicy). Note that this PR preserves the existing behavior and does not migrate off of [asyncio policy system](https://docs.python.org/3.14/library/asyncio-policy.html) yet. This will be done separately, see #39518.

To support 3.14, this PR:
1. Fixes #39507 by handling the call to deprecated `asyncio.get_event_loop_policy()` when calling `new_event_loop()`.  This was necessary because all warnings were elevated to errors within the context manager, and the newly deprecated policy caused an unhandled exception.
2. Handles the `BaseDefaultEventLoopPolicy.get_event_loop()` behavior change. [Before](https://github.com/python/cpython/blob/v3.13.7/Lib/asyncio/events.py#L695) Python 3.14, it would only throw `RuntimeError` when there's no loop in non-main threads. [Starting with](https://github.com/python/cpython/blob/v3.14.0rc3/Lib/asyncio/events.py#L714) Python 3.14, the special handling of the main thread is removed. This PR preserves the pre-3.14 grpc.aio behavior for 3.14.
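
The general idea of tolerating a missing event loop can be sketched as below. This is an assumption-laden illustration, not the PR's implementation: `get_or_create_loop` is a hypothetical helper, and unlike grpc.aio it does not go through the asyncio policy system; it only uses `asyncio.get_running_loop()` and `asyncio.new_event_loop()`.

```python
import asyncio


def get_or_create_loop():
    """Return (loop, created): the running loop, or a fresh one if none runs."""
    try:
        # Inside a coroutine or callback this returns the active loop.
        return asyncio.get_running_loop(), False
    except RuntimeError:
        # No running loop in this thread; the caller owns the new loop
        # and is responsible for closing it.
        return asyncio.new_event_loop(), True
```

Code that already runs under `asyncio.run()` always takes the first branch, which is the usage the note above recommends.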

Closes #40750

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40750 from sergiitk:fix/py/get_working_loop c4561debe5be3fda70177e060344f08fe0849852
PiperOrigin-RevId: 810291532
2025-09-22 22:29:22 -07:00
siddharth nohria
aed2f7989f [Resource Quota] Add Server wide Stream Quota (#39125)
Add Stream quota, to allow users to set server wide max_outstanding_streams, in addition to the per-connection limit.

Closes #39125

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/39125 from siddharthnohria:max_outstanding_streams 32ae21514d5321a76b41b8445d16753a095914f8
PiperOrigin-RevId: 807985441
2025-09-16 22:20:22 -07:00
Adam Heller
f5ffef4d6b [test] Add PostMortem dumps on CHECK failures in test builds (#39945)
See `grpc_check.h`. This code redefines the abseil `CHECK*` macros using custom gRPC macros when building tests. In `bazel test ...` builds, on check failure, `PostMortemEmit()` will dump state to the log before crashing.

Caveat: to prevent circular dependencies, code that `postmortem` relies on cannot use the custom gRPC CHECK macros. This is not much code, ~50 source files. grep for the `absl/log:check` bazel dependency.

Closes #39945

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/39945 from drfloob:grpc_check ca8e46718f2021e0df79aa67a3a0b0c751b3ce44
PiperOrigin-RevId: 807452496
2025-09-15 17:43:19 -07:00
siddharth nohria
af9837c932 [Core] Resource Tracking (#40698)
Introduce a centralized Resource Tracking mechanism in gRPC core, to provide a centralized way to access job-level resources.

There are multiple features in gRPC which can benefit from having better visibility into the job-level resource usage.
* Debuggability: Knowing that the Client / Server was experiencing high CPU usage at the time of some request can serve as a valuable insight for debugging poor latencies / failures.

* Load Shedding: gRPC’s ResourceQuota currently depends upon users defining limits, and only track gRPC channel-level usage. Configuring this can be difficult at times, especially if the application level usage for different requests varies significantly. In addition, visibility into Container memory usage can allow us to enable ResourceQuota by default in the future.

Closes #40698

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40698 from siddharthnohria:container-memory b058a0ed7ef801fdd0be2bfc04e1a481f0908a5d
PiperOrigin-RevId: 807142322
2025-09-15 02:02:24 -07:00
Craig Tiller
c93531aefc Automated rollback of commit bc949f8fb2.
PiperOrigin-RevId: 807098531
2025-09-14 23:26:02 -07:00
Craig Tiller
bc949f8fb2 Integrate telemetry domains with channelz.
This change adds `MetricsDomainNode` and `MetricsDomainStorageNode` to channelz. `QueryableDomain` and `DomainStorage` now inherit from `channelz::DataSource`, exposing domain metadata, labels, and metric values through channelz.

PiperOrigin-RevId: 806081893
2025-09-11 19:46:42 -07:00
Sreenithi Sridharan
b931270b88 Revert "[Python] Fix Python observability test breakage (#40481)" (#40662)
This reverts PR #40481

#40481 was work in progress and doesn't fix the mentioned observability test yet. It was wrongly submitted due to an earlier approval.

Closes #40662

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40662 from sreenithi:revert_wrong_commit bc8f55c2ed6c5fb8220f37827a43bb83a7d6511d
PiperOrigin-RevId: 805405106
2025-09-10 10:02:08 -07:00
Sreenithi Sridharan
5d81e88ae3 [Python] Fix Python observability test breakage (#40481)
#40417 has caused the Python Basic Tests job to fail continuously since this morning with the error
```
from grpc_observability import _open_telemetry_observability
  File "/var/local/git/grpc/py39/lib/python3.9/site-packages/grpc_observability/_open_telemetry_observability.py", line 23, in <module>
    from grpc_observability import _cyobservability
ImportError: /var/local/git/grpc/py39/lib/python3.9/site-packages/grpc_observability/_cyobservability.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN9grpc_core17instrument_detail15QueryableDomain19AllocateDoubleGaugeESt17basic_string_viewIcSt11char_traitsIcEES5_S5_
```

This PR fixes it by adding the required dependency in grpcio-python-observability.

Passing Basic Tests Run using this fix:
https://btx.cloud.google.com/invocations/239da2ca-1394-42f7-aa5c-ac63b938b200/targets

Closes #40481

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40481 from sreenithi:fix_basic_test_breakage 124d6da56ccca7eae92a64d911ab6eb31f05175f
PiperOrigin-RevId: 805276991
2025-09-10 02:52:44 -07:00
Sergii Tkachenko
9e2283db67 [Python] aio: skip grpc/aio shutdown if py interpreter is finalizing (#40447)
This PR changes the logic of `shutdown_grpc_aio` to skip `_actual_aio_shutdown` if the Python interpreter is already [being finalized](https://docs.python.org/3.14/glossary.html#term-interpreter-shutdown) (cleaning up resources, destroying objects, preparing for program exit, etc.). `_actual_aio_shutdown` involves `PollerCompletionQueue` shutdown, followed by the core [`grpc_shutdown`](https://grpc.github.io/grpc/core/grpc_8h.html#a35f55253e80714c17f4f3a0657e06f1b) API call.

Reasoning:
1. During finalization, in some cases the resources we're accessing may already be freed, and the order is not deterministic. Some of the resources unloaded prior to the `_actual_aio_shutdown` call: `_global_aio_state`, the `AsyncIOEngine` enum, or even Python libraries like `sys`. This leads to errors like `AttributeError: 'NoneType' object has no attribute 'POLLER'`.
2. `PollerCompletionQueue.shutdown()` will try to wait on its poller thread to finish gracefully. In py3.14, `PythonFinalizationError` is raised when `Thread.join()` is called during finalization. I think the logic here is similar to (1): these threads may have already been deallocated.

Note that in some cases users were able to prevent `_actual_aio_shutdown` from being called by manually calling `init_grpc_aio` prior to initializing any grpc objects. This resulted in an incorrect positive refcount, which prevents `_actual_aio_shutdown` from being run. Before the above finalization check was added, this side effect was sometimes misused to avoid deadlock on finalization (#22365).
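
The shape of such a guard can be sketched with `sys.is_finalizing()`, which reports whether the interpreter is shutting down. The function below is a hypothetical illustration rather than the code in `shutdown_grpc_aio`; the `is_finalizing` parameter exists only so the skip path can be exercised in a test.

```python
import sys


def shutdown_unless_finalizing(actual_shutdown, is_finalizing=sys.is_finalizing):
    """Run `actual_shutdown` unless the interpreter is already finalizing."""
    if is_finalizing():
        # Module globals and threads may already be torn down; touching
        # them now risks AttributeError or PythonFinalizationError.
        return False
    actual_shutdown()
    return True
```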

This PR:
- Fixes #39520
- Fixes #22365
- Fixes #38679
- Fixes #33342
- Fixes #36655

Closes #40447

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40447 from sergiitk:fix/aio/shutdown 11114f6feffd7380e9fde56e7581eb19cf001597
PiperOrigin-RevId: 804971756
2025-09-09 10:40:10 -07:00
Eugene Ostroukhov
7949731064 [event_engine] Introduce an event_engine_poller_for_python experiment (#40243)
This commit introduces the `event_engine_poller_for_python` experiment, allowing for controlled rollout and testing of the EventEngine's Posix poller specifically within gRPC Python bindings.

**Crucially, this change does *not* alter the default behavior for builds where `GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER` is *not* defined.** In such configurations, the Posix EventEngine poller will continue to be enabled unconditionally, preserving existing functionality.

The primary impact of this change is when `GRPC_DO_NOT_INSTANTIATE_POSIX_POLLER` *is* defined (e.g., in gRPC Python's build system). In this scenario, the enablement of the Posix poller transitions from being implicitly disabled to being configurable via the `event_engine_poller_for_python` experiment flag. This enables a controlled, experimental rollout of the EventEngine poller in environments that previously opted out of its direct instantiation.

Closes #40243

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40243 from eugeneo:python-no-backup-poller-experiment 17906e6501b8e6fe7d2ccc63439ec121ef47d43b
PiperOrigin-RevId: 804472655
2025-09-08 09:52:13 -07:00
Sreenithi Sridharan
ee5325b9f1 [Python][Support 3.14] Enable 3.14 in Python Basic, Bazel and Distrib tests (#40403)
This PR enables Python 3.14 in all the different tests - Basic tests (Native Python tests), Bazel tests and Distrib tests to build Python 3.14 artifacts. In addition, it also updates all the public facing METADATA versions.

## Distribtests
Required pre-requisite changes to enable 3.14 artifacts are covered in #40289 .

## Bazel tests
Enabling Python 3.14 required updating the rules_python version to a more recent version that supports 3.14. This was done in #40602

## Basic tests
The following errors were caught by the Basic tests when running via Python 3.14 and resolved in this PR:

### 1) No running event loop for AsyncIO when run outside an async function
```
Traceback (most recent call last):
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/common.pyx.pxi", line 184, in grpc._cython.cygrpc.get_working_loop
RuntimeError: no running event loop
```
This was caught by the `tests_aio.unit.outside_init_test.TestOutsideInit` and `tests_aio.unit.init_test.TestInit` tests, and was also previously reported in #39507 with the root cause.

Following some investigation, the fix is being worked on by @sergiitk  in PR #40293. In order to parallelize the fix and this PR, these 2 tests are currently being skipped for Python 3.14 and above.

### 2) Pickling error from the `multiprocessing` library
```
_pickle.PicklingError: Can't pickle <function _test_well_known_types at 0x7f3937eee610>: it's not the same object as tests.unit._dynamic_stubs_test._test_well_known_types
when serializing dict item '_target'
when serializing multiprocessing.context.Process state
when serializing multiprocessing.context.Process object
```
This was caught by the `tests.unit._dynamic_stubs_test.DynamicStubTest` which runs test cases in a subprocess using the `multiprocessing` library.
Error root cause:
- The default start method of multiprocessing on Linux has changed from `fork` to `forkserver` starting with Python 3.14.
- `forkserver` has a few extra restrictions for picklability as compared to `fork` (Ref: [Python Docs](https://docs.python.org/3.14/library/multiprocessing.html#the-spawn-and-forkserver-start-methods))
- All the [test case functions](0243842d5d/src/python/grpcio_tests/tests/unit/_dynamic_stubs_test.py (L115)) in the DynamicStubTest that are provided as `target` to the `multiprocessing.Process` use decorators. This causes problems when pickling them.

Hence to resolve this, we manually set the 'start method' of `multiprocessing` to use the `fork` start method.
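
A minimal sketch of pinning the `fork` start method, assuming a POSIX platform where `fork` is available. It uses `multiprocessing.get_context("fork")` instead of the global `set_start_method` so the example stays self-contained; `logged`, `child_task`, and `run_in_forked_child` are hypothetical names, and the decorator is only there to mimic a decorated test function that may not pickle cleanly under `forkserver`.

```python
import multiprocessing


def logged(fn):
    """Toy decorator; the returned wrapper is a local function."""
    def wrapper(*args):
        return fn(*args)
    return wrapper


@logged
def child_task(queue):
    # Runs in the child process and reports back to the parent.
    queue.put("ok")


def run_in_forked_child():
    """Run `child_task` in a child created with the fork start method."""
    # With fork the child inherits the parent's memory, so the target
    # does not have to be pickled.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    proc = ctx.Process(target=child_task, args=(queue,))
    proc.start()
    result = queue.get(timeout=30)
    proc.join(timeout=30)
    return result, proc.exitcode
```

Using a context object limits the choice of start method to the code that needs it, whereas `multiprocessing.set_start_method("fork")` changes it for the whole process.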

Closes #40403

PiperOrigin-RevId: 804290760
2025-09-07 23:58:12 -07:00
Craig Tiller
34611992c3 Convert ztrace to proto
PiperOrigin-RevId: 802878974
2025-09-03 23:40:59 -07:00
Pawan Bhardwaj
fd1d6e0911 [xDS] Matcher API implementation (#39877)
Proto : https://github.com/cncf/xds/blob/main/xds/type/matcher/v3/matcher.proto
Will follow up with more concrete classes as per usage. Added some basic ones to test e2e parsing.

Closes #39877

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/39877 from pawbhard:matcher e87d7ce6584c32a71a5610781d16e46cb78841d6
PiperOrigin-RevId: 802661365
2025-09-03 12:31:55 -07:00
Craig Tiller
30f993b1e8 Migrate Resource Quota telemetry to a dedicated InstrumentDomain.
Roll forward of a prior change, this change includes fixes and also memory reclamation support for orphaned domain storage.

PiperOrigin-RevId: 802204833
2025-09-02 10:44:54 -07:00
apolcyn
73c0f8ac9c [release] Bump dev version on to 1.76.0-dev (#40484)
As title

Closes #40484

COPYBARA_INTEGRATE_REVIEW=https://github.com/grpc/grpc/pull/40484 from apolcyn:bump_dev_version_202508191952 e788be57e9dc7f5e8316bee4baadec26fba3f6e6
PiperOrigin-RevId: 798331971
2025-08-22 14:01:19 -07:00
Craig Tiller
fcf69fba84 Automated rollback of commit 56976b0b3c.
PiperOrigin-RevId: 796900113
2025-08-19 09:13:22 -07:00
Craig Tiller
56976b0b3c Migrate Resource Quota telemetry to a dedicated InstrumentDomain.
This change introduces a new `ResourceQuotaDomain` and registers Resource Quota related counters (`rq_calls_dropped`, `rq_calls_rejected`, `rq_connections_dropped`) within this domain. Each `MemoryQuota` now holds a reference to a `ResourceQuotaDomain` Storage instance, allowing these metrics to be tracked per resource quota. The usage sites in `chttp2_transport.cc` and `parsing.cc` are updated to use the new per-quota telemetry storage. The old global stats definitions for these counters are removed.

Introduce gauges also, and use them to report current memory pressure.

PiperOrigin-RevId: 796613444
2025-08-18 16:08:16 -07:00