Rust: Add predicate for certain type information #20155

Open: 2 commits to be merged into main

paldepind
Contributor

@paldepind paldepind commented Aug 2, 2025

Motivated by the performance problems seen in #20140, this PR introduces a new "phase" to type inference in the form of a new inferCertainType predicate.

Motivation

There are many places where we approximate things in type inference. These inaccuracies sometimes cause us to infer multiple types for the same term. Inaccurate types can lead to other inaccurate types, and things can spiral and blow up from there. In the worst case we end up with cycles in type inference.

In #20140 performance blew up in Sway at a place where several cycles in inference happened. This PR adds a reduced version of that problem, which was caused by a type coercion that we don't handle. I don't see an easy way to handle such coercions correctly, and more generally think it'll be hard to make our entire type inference 100% correct.
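For illustration, here is a minimal Rust sketch of the kind of coercion that trips up equality-based inference. It is not the reduced test case added in this PR, and the function name coerced_len is made up. It only assumes an inference rule that equates the type of a let binding with the type of its initializer.

```rust
/// Hypothetical example: the annotated binding and its initializer
/// have different types because the compiler inserts a deref coercion.
fn coerced_len(s: String) -> usize {
    // The annotation makes the type of `r` certain: `&str`.
    // The initializer `&s` has type `&String`; it is coerced to `&str`,
    // so "type of pattern == type of initializer" does not hold here.
    let r: &str = &s;
    r.len()
}
```

With such a rule the annotation would push &str onto &s, and from there str onto s, next to the correct String. For example, coerced_len(String::from("abc")) returns 3, and the compiler accepts the binding only because of the coercion.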

Proposed solution

This PR adds a new "phase" to type inference, which is just a new predicate called inferCertainType. The idea is that

  • inferCertainType contains a subset of the type inference that is sure to be 100% correct. This implies that inferCertainType (ideally) never contains multiple types for the same node and path.
  • inferType uses (and is a superset of) inferCertainType, but not the other way around. This means that we can use a negated inferCertainType in inferType to eliminate or block inaccuracies where we actually have accurate information.

This PR implements a minimum viable version of the idea, which is sufficient to fix the cycle in Sway (and in the test). We infer certain type information only for (1) annotated terms and (2) very simple calls, and let the information propagate along a few equalities. The PR also uses a negated inferCertainType to block inferTypeEquality from letting (wrong) type information flow to a node where we have complete and correct type information. This is what cuts the cycle in the example.
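The blocking idea can be sketched with a toy model in Rust. This is not the QL implementation: infer_certain, infer, types_of, the string-based node and type names, and the block flag are all hypothetical stand-ins, and type paths are left out entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

type Node = &'static str;
type Ty = &'static str;

/// Toy stand-in for `inferCertainType`: only annotations contribute.
fn infer_certain(annotations: &BTreeMap<Node, Ty>) -> BTreeMap<Node, Ty> {
    annotations.clone()
}

/// Toy stand-in for `inferType`: starts from the certain types plus some
/// uncertain seed facts and propagates types along (approximate) equalities
/// until nothing new is derived. When `block` is true, a node that already
/// has a certain type never receives a type through an equality.
fn infer(
    certain: &BTreeMap<Node, Ty>,
    seeds: &[(Node, Ty)],
    equalities: &[(Node, Node)],
    block: bool,
) -> BTreeSet<(Node, Ty)> {
    let mut types: BTreeSet<(Node, Ty)> = certain.iter().map(|(n, t)| (*n, *t)).collect();
    types.extend(seeds.iter().copied());
    loop {
        let mut new = Vec::new();
        for &(a, b) in equalities {
            // Equalities are symmetric, so try both directions.
            for &(from, to) in &[(a, b), (b, a)] {
                if block && certain.contains_key(&to) {
                    continue; // the negated certain-type guard
                }
                for &(n, t) in &types {
                    if n == from && !types.contains(&(to, t)) {
                        new.push((to, t));
                    }
                }
            }
        }
        if new.is_empty() {
            return types;
        }
        types.extend(new);
    }
}

/// All types inferred for `node`, in sorted order.
fn types_of(result: &BTreeSet<(Node, Ty)>, node: Node) -> Vec<Ty> {
    result.iter().filter(|(n, _)| *n == node).map(|(_, t)| *t).collect()
}
```

For a binding like `let r: &str = &s;`, with r annotated as &str, the seed fact that &s has type &String, and an equality between r and &s, the unblocked run gives r two types while the blocked run leaves r with only &str. In this toy model the wrong type still flows from r to the uncertain node &s; only flow into nodes with certain information is cut.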

Pros of this approach:

  • We get a separation in the code between inaccurate and accurate inference rules.
  • We can contain wrongly inferred types from propagating and prevent cycles.
  • The above could also potentially result in better performance. We might also be able to guard expensive type inference with a not inferCertainType(n, _), i.e., if the simple and correct rules give us a type we can stop inference there.
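The guard mentioned in the last point could look roughly like this in a toy Rust model. This is not QL, and infer_with_guard and its parameters are hypothetical names; the closure stands in for whatever expensive, approximate rules would otherwise run.

```rust
type Ty = &'static str;

/// If the simple and correct rules already produced a type, return it and
/// never run the expensive, approximate rules for this node.
fn infer_with_guard(certain: Option<Ty>, expensive: impl Fn() -> Vec<Ty>) -> Vec<Ty> {
    match certain {
        // Certain information exists: stop inference here.
        Some(ty) => vec![ty],
        // No certain information: fall back to the approximate rules.
        None => expensive(),
    }
}
```

In QL the same effect would come from the negation rather than from control flow, so this only illustrates the intent.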

Cons:

  • The architecture is slightly more complicated.
  • There is a bit of duplication in the certain and the uncertain part of type inference.

Results on #20140

With this PR the number of nodes with a type at the type path length limit in Sway is reduced from 76 to 63.

When quick eval'ing inferType on Sway I see the following:

| | Number of tuples | Time | DCA |
| main | 2,404,901 | 63s | |
| #20140 | 5,458,859 | 107s | DCA |
| #20140 + this PR | 2,414,244 | 70s | DCA |

So this PR seems to remove the blowup caused by #20140 for Sway.

Future work

As mentioned this is just a minimum viable implementation.

  • We should be able to move more stuff into inferCertainType.

  • Right now inferCertainType(n, path) is only allowed to give results if we have complete information about the entire type tree for n. It would be nice to lift this requirement so that it is allowed to have results as long as the type specifically at path is certain.

    This will make it much easier to soundly include things in inferCertainType. For instance, then we can have a certain root type for Some(a) without necessarily having a certain type for the nested type.

    However, we still need to know when we have complete certain information of a node (to implement the block in inferTypeEquality). Ideally that could be derived from inferCertainType by checking if the type tree is complete, but that is more work to implement.
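A rough Rust sketch of what deriving completeness from per-path certain types might involve. This is a toy model, not a proposal for the QL code: CertainTree, type_params, is_complete, and the dotted path encoding are all hypothetical.

```rust
use std::collections::BTreeMap;

/// Certain types of a single node, keyed by type path ("" is the root).
type CertainTree = BTreeMap<String, &'static str>;

/// Hypothetical table of type constructors and their type parameter names.
fn type_params(ty: &str) -> &'static [&'static str] {
    match ty {
        "Option" => &["T"],
        "Result" => &["T", "E"],
        _ => &[],
    }
}

/// The tree is complete at `path` if a certain type is known there and,
/// recursively, at the path of every type parameter of that type.
fn is_complete(tree: &CertainTree, path: &str) -> bool {
    match tree.get(path) {
        None => false,
        Some(ty) => type_params(ty).iter().all(|p| {
            let child = if path.is_empty() {
                p.to_string()
            } else {
                format!("{}.{}", path, p)
            };
            is_complete(tree, &child)
        }),
    }
}
```

For Some(a) with only the root known, the tree holds Option at the root path, so a per-path lookup succeeds there while is_complete returns false. Once a certain type for T is added, is_complete returns true and the block in inferTypeEquality could apply.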

@github-actions github-actions bot added the Rust label (Pull requests that update Rust code) Aug 2, 2025
@paldepind paldepind changed the title from "Rust/type inference certain" to "Rust: Add predicate for certain type information" Aug 2, 2025
@paldepind paldepind force-pushed the rust/type-inference-certain branch 2 times, most recently from 21f2c08 to f5f7b61 August 3, 2025 12:45
@paldepind paldepind marked this pull request as ready for review August 4, 2025 05:46
@paldepind paldepind requested a review from a team as a code owner August 4, 2025 05:46
@Copilot Copilot AI review requested due to automatic review settings August 4, 2025 05:46
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a new "phase" to Rust type inference by adding an inferCertainType predicate that contains only type inference rules guaranteed to be 100% correct. This addresses performance issues caused by cycles in type inference that can occur when inaccurate approximations lead to multiple types being inferred for the same term.

Key changes include:

  • Implementation of inferCertainType predicate for accurate type inference on annotated terms and simple calls
  • Integration of certain type information into the main inferType predicate
  • Prevention of inaccurate type propagation when certain information is available

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| rust/ql/lib/codeql/rust/internal/TypeInference.qll | Adds CertainTypeInference module and integrates it into main type inference |
| rust/ql/lib/codeql/rust/internal/TypeInferenceConsistency.qll | Adds consistency check for non-unique certain types |
| rust/ql/test/library-tests/type-inference/dereference.rs | Adds test case demonstrating the cycle-causing scenario |
| rust/ql/test/library-tests/type-inference/main.rs | Updates test expectation comment |
| *.expected files | Updates test expectations reflecting improved type inference results |
Comments suppressed due to low confidence (3)

rust/ql/lib/codeql/rust/internal/TypeInference.qll:1

  • The predicate typeMentionIsComplete should have a documentation comment explaining what constitutes a 'complete' type mention and why this check is important for certain type inference.
/** Provides functionality for inferring types. */

rust/ql/lib/codeql/rust/internal/TypeInference.qll:1

  • This complex predicate should have comprehensive documentation explaining its purpose, the rationale behind each constraint, and providing examples of calls that would and wouldn't be considered 'certain'.
/** Provides functionality for inferring types. */

rust/ql/lib/codeql/rust/internal/TypeInference.qll:1

  • This guard condition is critical for preventing cycles. The comment should be expanded to explain how this prevents the specific cycle described in the PR description and why it's safe to block all uncertain type propagation when certain information exists.
/** Provides functionality for inferring types. */

@paldepind paldepind added the no-change-note-required label (This PR does not need a change note) Aug 4, 2025
@paldepind paldepind force-pushed the rust/type-inference-certain branch from f5f7b61 to 3ba285c August 4, 2025 12:06
@paldepind
Contributor Author

The last DCA report contains the new "Nodes With Type At Length Limit" metric: https://github.com/github/codeql-dca-main/blob/data/paldepind/PR-20155-0-rust__2/reports/readme.md#nodes-with-type-at-length-limit. The changes look pretty good, especially on Databend (though overall analysis time is still down a bit).
