Issue Details

Number
27754
Title
`coinselector_tests` use non-representative CoinSelectionParams and UTXO costs
Description
### Is there an existing issue for this?

- [X] I have searched the existing issues

### Current behaviour

Most test cases in the `coinselector_tests` have `effective_feerate`, `long_term_feerate`, `discard_feerate`, and `cost_of_change` in the corresponding CoinSelectionParams set to 0, which is not accepted during production use of the wallet with the default configuration: we require transactions to pay at least `minRelayTxFee` (1 ṩ/vB). Under a minimum feerate of 1 ṩ/vB, a (standard) change output will always incur a `cost_of_change` of at least 31 ṩ.

Similarly, the `add_coin()` functions use values for the UTXOs that are not encountered in production use. The UTXOs are created with `feerate`, `input_bytes`, `custom_size`, and `fees` set to 0. Inputs can never be smaller than 41 bytes, and thus must incur a fee of at least 41 ṩ during production use.

![image](https://github.com/bitcoin/bitcoin/assets/4060799/eb2b286b-502c-4f62-9185-7d9120290ae3)

The main purpose of the coinselector_tests is to verify that coin selection behaves as expected for production use. While it is also necessary to test edge cases and outliers, the unrealistic values for CoinSelectionParams and UTXOs cause coin selection to behave differently than in production, which means that production behavior is not well reflected in the tests. It is also confusing to developers trying to reason about coin selection behavior. E.g., our coin selection works by creating multiple input set candidates and then picking the one that scores lowest on “waste”. In the case of multiple input sets tying on waste, we pick the one using more inputs. When `cost_of_change` is set to 0, our waste calculation surmises that we created a changeless transaction, and any overselection will be dropped to fees. This is incorrect when we are actually comparing input sets produced by coin selection algorithms that produce change (e.g. SRD vs Knapsack in the `check_max_weight` test case), which then leads to an unexpected outcome.

### Expected behaviour

Coin selection should be tested with values encountered during production use:

- We should test feerates from the range of commonly encountered feerates, e.g. `[1000, 5000, 40_000, 500_000]` (in ṩ/kvB). As edge cases we may also want some tests for feerate=0 (permitted by consensus but non-standard) and feerate=2_000_000 (should trigger the extreme feerate warning).
- `long_term_feerate` should be set to the default value of 10_000 except in cases where it is the explicit focus of the test.
- `discard_feerate` should be set to 3000 except in cases where it is the explicit focus of the test.
- UTXOs should be created using representative conditions for the `coinselector_tests`. Preferably their costs would be calculated from the feerate in the CoinSelectionParams and a representative input weight. Inputs should be assumed to weigh between ~57.5–148 vB by default, perhaps with an edge case testing 41 vB (the minimum input size). Inputs should incur fees of at least 58 ṩ (P2TR at 1 ṩ/vB).
- Change should be assumed to cost fees of at least 31 ṩ (P2WPKH at 1 ṩ/vB).

The CoinSelectionParams should be set up once in the `SETUP` for the test cases, and only the parameters that are explicitly relevant to a test should be amended in the individual test cases. The `add_coin()` functions should use values corresponding to the test case’s CoinSelectionParams, unless they are explicitly set differently to test something (see the sketch below).
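To make the intent concrete, here is a minimal, self-contained sketch of such a setup. The `TestParams` struct and `FeeFor()` helper are hypothetical stand-ins used only for illustration; they are not the actual `CoinSelectionParams` or `add_coin()` interfaces. The point is merely that the change cost and per-UTXO fees can be derived from a representative feerate instead of being hard-coded to 0.

```cpp
// Hypothetical sketch: derive representative UTXO fees and the cost of
// change from the test feerate instead of hard-coding zeros. TestParams
// and FeeFor() are made-up stand-ins, not Bitcoin Core types.
#include <cstdint>
#include <iostream>

// Fee in ṩ for `vbytes` at `feerate` given in ṩ/kvB, rounded up.
int64_t FeeFor(int64_t vbytes, int64_t feerate)
{
    return (vbytes * feerate + 999) / 1000;
}

struct TestParams {            // stand-in for the test's CoinSelectionParams
    int64_t effective_feerate; // ṩ/kvB, e.g. 1000 (= 1 ṩ/vB)
    int64_t long_term_feerate; // ṩ/kvB, default 10'000
    int64_t discard_feerate;   // ṩ/kvB, default 3'000
    int64_t cost_of_change;    // ṩ, derived below
};

int main()
{
    TestParams params{/*effective_feerate=*/1000, /*long_term_feerate=*/10'000,
                      /*discard_feerate=*/3'000, /*cost_of_change=*/0};

    // A P2WPKH change output is 31 vB, so at 1 ṩ/vB change costs at least 31 ṩ.
    params.cost_of_change = FeeFor(31, params.effective_feerate);

    // A representative P2TR input (~57.5 vB, rounded up to 58) at the same feerate.
    const int64_t input_fee = FeeFor(58, params.effective_feerate);

    std::cout << "cost_of_change: " << params.cost_of_change << " ṩ\n"
              << "input fee:      " << input_fee << " ṩ\n";
}
```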
### Steps to reproduce

I was looking at `coinselector_tests.cpp` and came upon the `check_max_weight` test case. In Scenario 1 there are 1515 coins of 0.033 ₿ and one coin of 50 ₿. A solution composed only of the smaller coins would exceed the allowed weight, and the test checks whether a valid solution is found. What surprised me was that the expected outcome is supposed to be that only one input is picked: the 50 ₿ input. This was surprising to me because SRD now uses a heap to kick out the lowest-value UTXOs if it exceeds the allowed max_weight, so I would expect SRD to always find a solution if there is one.

Since the test sets the long-term feerate and the feerate to 0, and both a lowest-larger solution per Knapsack and a many-input solution per SRD would have change, I would expect their waste to be just the cost_of_change, which should be the same for either (and at a feerate of 0 should be zero). So I would expect both input sets to tie on waste. ChooseSelection breaks ties on waste by preferring the input set with more inputs. Yet the test consistently picks the single-input solution by Knapsack, while I would have expected it to pick the SRD solution because the two should be tied on waste and the latter uses more inputs.

So I added some print statements to `ChooseSelection` and learned that the input sets in this test _do have non-zero_ waste. I had a hard time explaining where the waste score for these input sets came from until I realized that our transaction building assumes that no change is created when `cost_of_change` is set to 0, and thus any overshoot during the selection is presumed to be excess and added to the waste score. At production feerates and costs the SRD solution would be preferred, because both input sets produce change and the SRD solution would produce a lower waste score (see the sketch at the end of this report).

### What version of Bitcoin Core are you using?

master@fa53611cf1b2ab2f49470ba75ee29a94a7b89105
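For illustration, here is a self-contained sketch of the waste comparison described under “Steps to reproduce”. The `Waste()` helper reflects my understanding of the waste metric (fees minus long-term fees, plus the change cost if the selection has change, otherwise the excess); the concrete numbers are invented for the example and are not the values used in `check_max_weight`.

```cpp
// Illustrative sketch of the waste comparison; Waste() and all numbers
// are assumptions for this example, not the actual implementation or
// the values used in check_max_weight.
#include <cstdint>
#include <iostream>

int64_t Waste(int64_t fees, int64_t long_term_fees, int64_t change_cost, int64_t excess)
{
    // A change_cost of 0 is treated as "no change", so the overshoot
    // (excess) is added to the waste score instead of the change cost.
    return fees - long_term_fees + (change_cost > 0 ? change_cost : excess);
}

int main()
{
    // With the test's zeroed parameters, fees and long-term fees contribute
    // nothing, so waste reduces to the excess. The expected tie disappears:
    // whichever candidate happens to overshoot less wins, instead of the
    // tie-break on input count. (Overshoot values are made up.)
    std::cout << "cost_of_change = 0:\n"
              << "  single-input waste: " << Waste(0, 0, 0, 5'000) << "\n"
              << "  many-input waste:   " << Waste(0, 0, 0, 20'000) << "\n";

    // With representative values (1 ṩ/vB now, 10 ṩ/vB long-term,
    // cost_of_change = 31 ṩ, 58 ṩ fee per P2TR-sized input), both
    // candidates produce change, the excess drops out, and the input set
    // with more inputs scores lower because fee - long_term_fee is
    // negative per input. (The 20-input count is a made-up example.)
    std::cout << "representative params:\n"
              << "  single-input waste: " << Waste(58, 580, 31, 0) << "\n"
              << "  20-input waste:     " << Waste(20 * 58, 20 * 580, 31, 0) << "\n";
}
```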
URL
https://github.com/bitcoin/bitcoin/issues/27754