Description
I've built a library in Rust that wraps fred (https://docs.rs/fred/latest/fred/) to provide a fast and stable Redis client that we use extensively. Unfortunately, I'm seeing a noticeable performance degradation after migrating from pyo3-asyncio 0.20/0.21 to pyo3-async-runtimes 0.22 (and I see the same issue up to and including 0.26). I'm not extremely proficient in Rust, so I'm wondering whether it's something I did while rewriting the functions for the new API or whether it's something more fundamental. fred uses tokio, in case that makes a difference.
Here's an example of a function where the performance is noticeably lower:
Using pyo3-asyncio 0.20:
pub fn ping<'a>(&self, py: Python<'a>) -> PyResult<&'a PyAny> {
    let client = self.client.clone();
    future_into_py(py, async move {
        client.ping::<()>().await.map_err(PyRedisError::from)?;
        Ok(())
    })
}
I adjusted this to the following to make it compile with pyo3-asyncio-0-21, which still performs the same as the 0.20 version:
pub fn ping<'a>(&self, py: Python<'a>) -> PyResult<Bound<'a, PyAny>> {
    let client = self.client.clone();
    future_into_py(py, async move {
        client.ping::<()>().await.map_err(PyRedisError::from)?;
        Ok(())
    })
}
I've also tried a different variant, which I came across in another project, but it shows the same performance degradation with pyo3-async-runtimes 0.22+:
pub fn ping(&self, py: Python) -> PyResult<Py<PyAny>> {
    let client = self.client.clone();
    let a = future_into_py(py, async move {
        client.ping::<()>().await.map_err(PyRedisError::from)?;
        Ok(())
    })?;
    Ok(a.into_pyobject(py)?.into_any().into())
}
I've created a miniaturized version of my library here: https://github.com/iksteen/pyo3-performance. There are 7 branches: master, 0.21-bound, 0.22-bound, 0.23-bound, 0.24-bound, 0.24-into_pyobject and 0.26-bound.
I use this script to benchmark all the variants:
for b in master 0.21-bound 0.22-bound 0.23-bound 0.24-bound 0.24-into_pyobject 0.26-bound; do
    echo $b
    git checkout -q $b
    hyperfine --setup 'maturin develop --release' --warmup 1 'python bench.py'
done | tee benchmark.txt
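The repository's bench.py isn't reproduced here; the shape of such a benchmark is roughly the following, with a no-op coroutine standing in for the extension's ping (the real script would await the awaitable returned by the Rust module instead):

```python
import asyncio
import time


async def ping() -> None:
    # Stand-in for the Rust extension's ping(); with a no-op body, the
    # measured time is dominated by per-call await/scheduling overhead,
    # which is exactly where the 0.22+ regression would show up.
    return None


async def main(n: int = 100_000) -> float:
    # Await the coroutine n times in a tight loop and return elapsed seconds.
    start = time.perf_counter()
    for _ in range(n):
        await ping()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"{asyncio.run(main()):.3f} s")
```
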
master (pyo3-asyncio 0.20, returning PyResult<&PyAny>)
Time (mean ± σ): 2.717 s ± 0.047 s [User: 3.115 s, System: 1.487 s]
Range (min … max): 2.656 s … 2.797 s 10 runs
0.21-bound (pyo3-asyncio-0-21 0.21, returning PyResult<Bound<PyAny>>)
Time (mean ± σ): 2.679 s ± 0.045 s [User: 3.067 s, System: 1.501 s]
Range (min … max): 2.621 s … 2.778 s 10 runs
0.22-bound (pyo3-async-runtimes 0.22, returning PyResult<Bound<PyAny>>)
Time (mean ± σ): 3.137 s ± 0.034 s [User: 3.543 s, System: 2.106 s]
Range (min … max): 3.094 s … 3.184 s 10 runs
0.23-bound (pyo3-async-runtimes 0.23, returning PyResult<Bound<PyAny>>)
Time (mean ± σ): 3.128 s ± 0.032 s [User: 3.563 s, System: 2.066 s]
Range (min … max): 3.078 s … 3.171 s 10 runs
0.24-bound (pyo3-async-runtimes 0.24, returning PyResult<Bound<PyAny>>)
Time (mean ± σ): 3.091 s ± 0.040 s [User: 3.474 s, System: 2.085 s]
Range (min … max): 3.014 s … 3.141 s 10 runs
0.24-into_pyobject (pyo3-async-runtimes 0.24, returning PyResult<Py<PyAny>>)
Time (mean ± σ): 3.104 s ± 0.022 s [User: 3.485 s, System: 2.087 s]
Range (min … max): 3.082 s … 3.150 s 10 runs
0.26-bound (pyo3-async-runtimes 0.26, returning PyResult<Bound<PyAny>>)
Time (mean ± σ): 3.096 s ± 0.044 s [User: 3.503 s, System: 2.061 s]
Range (min … max): 3.045 s … 3.174 s 10 runs
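For reference, the ~14% figure quoted below roughly corresponds to comparing the fastest pyo3-async-runtimes mean (0.24-bound) against master:

```python
# Mean wall-clock times (s) taken from the hyperfine output above.
master = 2.717     # pyo3-asyncio 0.20
bound_024 = 3.091  # pyo3-async-runtimes 0.24, the fastest 0.22+ variant

slowdown = (bound_024 / master - 1) * 100
print(f"{slowdown:.1f}% slower")  # → 13.8% slower
```
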
Is this something I did wrong when updating to 0.22, or has the library simply become significantly slower (~14% in this benchmark)?