[PEP 747] Recognize TypeForm[T] type and values (#9773) #19596

Open
wants to merge 21 commits into
base: master

davidfstr
Contributor

@davidfstr davidfstr commented Aug 5, 2025

(This PR replaces an earlier draft of the same feature: #18690)

Feedback from @JukkaL integrated since the last PR, by commit title:

  • Apply feedback: Change MAYBE_UNRECOGNIZED_STR_TYPEFORM from unaccompanied note to standalone error
  • Apply feedback: Refactor extract save/restore of SemanticAnalyzer state to a new context manager
  • Apply feedback: Suppress SyntaxWarnings when parsing strings as types at the most-targeted location
  • Apply feedback: Add TypeForm profiling counters to SemanticAnalyzer and the --dump-build-stats option
    • Increase efficiency of quick rejection heuristic from 85.8% -> 99.6% in SemanticAnalyzer.try_parse_as_type_expression()
  • Apply feedback: Recognize assignment to union of TypeForm with non-TypeForm
  • Apply feedback: Alter primitives.pyi fixture rather than tuple.pyi and dict.pyi

Feedback NOT integrated, with rationale:

  • ✖️ Add tests related to recursive types
    • Recursive cases are already well-covered by tests related to TypeType (is_type_form=False).
    • I did find an infinite recursion bug affecting garden-variety Type[...], which I can fix in a separate PR.
  • ✖️ Define TypeForm(...) in value contexts as a regular function like Callable[[TypeForm[T]], TypeForm[T]] rather than as a special expression node (TypeFormExpr).
    • The special expression node allows mypy to print out better error messages when a user puts an invalid type expression inside TypeForm(...). See case 4 of testTypeFormExpression in check-typeform.test

There is one NOMERGE commit temporarily in this PR so that mypy_primer gives more insightful CI output:

  • NOMERGE: mypy_primer: Enable --enable-incomplete-feature=TypeForm when checking open source code

There is one commit unrelated to the core function of this PR that could be split to a separate PR:

  • Allow TypeAlias and PlaceholderNode to be stringified/printed

Closes #9773


(Most of the following description is copied from the original PR, except for the text in bold)

Implements the TypeForm PEP 747, as an opt-in feature enabled by the CLI flag --enable-incomplete-feature=TypeForm.

Implementation approach:

  • TypeForm[T] is represented as a type using the existing TypeType class, with an is_type_form=True constructor parameter. Type[C] continues to be represented using TypeType, but with is_type_form=False (the default).

  • Recognizing a type expression literal such as int | str requires parsing an Expression as a type expression. Only the SemanticAnalyzer pass has the ability to parse arbitrary type expressions (including stringified annotations), using SemanticAnalyzer.expr_to_analyzed_type(). (I've extended the TypeChecker pass to parse all kinds of type expressions except stringified annotations, using the new TypeCheckerAsSemanticAnalyzer adapter.)

  • Therefore during the SemanticAnalyzer pass, at certain syntactic locations (i.e. assignment r-values, callable arguments, returned expressions), the analyzer tries to parse the Expression it is looking at using try_parse_as_type_expression() - a new function - and stores the result (a Type) in {IndexExpr, OpExpr, StrExpr}.as_type - a new attribute.

  • During the later TypeChecker pass, when looking at an Expression to determine its type, if the expression is in a type context that expects some kind of TypeForm[...] and the expression was successfully parsed as a type expression by the earlier SemanticAnalyzer pass (or can be parsed as a type expression immediately during the type checker pass), the expression will be given the type TypeForm[expr.as_type] rather than using the regular type inference rules for a value expression.

  • Key relationships between TypeForm[T], Type[C], and object types are defined in the visitors powering is_subtype, join_types, and meet_types.

  • The TypeForm(T) expression is recognized as a TypeFormExpr and has the return type TypeForm[T].

  • The new test suite in check-typeform.test is a good reference to the expected behaviors for operations that interact with TypeForm in some way.

Controversial parts of this PR, in @davidfstr's opinion:

  • Type form literals containing stringified annotations are only recognized in certain syntactic locations (and not ALL possible locations). Namely they are recognized as (1) assignment r-values, (2) callable expression arguments, and (3) as returned expressions, but nowhere else. For example they aren't recognized in expressions like dict_with_typx_keys[int | str]. Attempting to use stringified annotations in other locations will emit a MAYBE_UNRECOGNIZED_STR_TYPEFORM error.

  • The existing TypeType class is now used to represent BOTH the Type[T] and TypeForm[T] types, rather than introducing a distinct subclass of Type to represent the TypeForm[T] type. This was done to simplify logic that manipulates both Type[T] and TypeForm[T] values, since they are both manipulated in very similar ways.

  • The "normalized" form of TypeForm[X | Y] - as returned by TypeType.make_normalized() - is just TypeForm[X | Y] rather than TypeForm[X] | TypeForm[Y], differing from the normalization behavior of Type[X | Y].
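
The normalization difference in the last point can be pictured with a toy model. This is not mypy's code: `Named`, `UnionOf`, `TypeTypeModel`, and this `make_normalized` are hypothetical stand-ins that only mirror the behavior described above, assuming Type[X | Y] distributes over the union while TypeForm[X | Y] does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Named:
    """Toy stand-in for a plain named type such as X."""
    name: str


@dataclass(frozen=True)
class UnionOf:
    """Toy stand-in for a union type such as X | Y."""
    items: tuple


@dataclass(frozen=True)
class TypeTypeModel:
    """Toy stand-in for TypeType: one class, distinguished by a flag."""
    item: object
    is_type_form: bool = False


def make_normalized(item, is_type_form=False):
    # Type[X | Y] is distributed into Type[X] | Type[Y] ...
    if isinstance(item, UnionOf) and not is_type_form:
        return UnionOf(tuple(TypeTypeModel(i) for i in item.items))
    # ... while TypeForm[X | Y] keeps the union as a single item.
    return TypeTypeModel(item, is_type_form)


x_or_y = UnionOf((Named("X"), Named("Y")))
as_type = make_normalized(x_or_y)
as_type_form = make_normalized(x_or_y, is_type_form=True)
```

Here `as_type` is a union of two TypeTypeModel values, whereas `as_type_form` is a single TypeTypeModel wrapping the whole union.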

User must opt-in to use TypeForm with --enable-incomplete-feature=TypeForm

In particular:
* Recognize TypeForm[T] as a kind of type that can be used in a type expression
* Recognize a type expression literal as a TypeForm value in:
    - assignments
    - function calls
    - return statements
* Define the following relationships between TypeForm values:
    - is_subtype
    - join_types
    - meet_types
* Recognize the TypeForm(...) expression
* Alter isinstance(typx, type) to narrow TypeForm[T] to Type[T]
In particular:
- Adjust error messages to use lowercased type names, which is now the default
- Adjust error message to align with upstream stub changes
- Fix multiple definition of TypeForm in typing_extensions.pyi, because definition was added upstream
- Fix TypeType equality definition to recognize type forms
    - Fixes test: $ pytest -q -k testTypeFormToTypeAssignability
...at the most-targeted location

Specific warning:
* SyntaxWarning: invalid escape sequence '\('
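
The idea of suppressing the warning at the narrowest possible scope can be sketched with the standard library alone. This is not the PR's implementation: `parse_quietly` is a hypothetical helper, and `ast.parse` stands in for whatever parser mypy calls on the string's contents.

```python
import ast
import warnings
from typing import Optional


def parse_quietly(text: str) -> Optional[ast.Expression]:
    """Parse text as an expression, hiding SyntaxWarning only for this call."""
    with warnings.catch_warnings():
        # The filter is scoped to this block, so warnings raised
        # elsewhere in the program are unaffected.
        warnings.simplefilter("ignore", SyntaxWarning)
        try:
            return ast.parse(text, mode="eval")
        except SyntaxError:
            # Not parseable as an expression, so not a type expression.
            return None
```

Keeping the filter around the single parse call, rather than around a whole analysis pass, avoids hiding unrelated SyntaxWarnings.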
...in SemanticAnalyzer.try_parse_as_type_expression()

Contributor

github-actions bot commented Aug 6, 2025

Diff from mypy_primer, showing the effect of this PR on open source code:

discord.py (https://github.com/Rapptz/discord.py)
- ...venv/lib/python3.13/site-packages/mypy/typeshed/stdlib/typing.pyi:1016: note: "update" of "TypedDict" defined here
+ ...venv/lib/python3.13/site-packages/mypy/typeshed/stdlib/typing.pyi:1017: note: "update" of "TypedDict" defined here

spark (https://github.com/apache/spark)
- python/pyspark/pandas/supported_api_gen.py:400: SyntaxWarning: invalid escape sequence '\_'
-   return func_str[:-1] + "\_"  # noqa: W605

@davidfstr
Contributor Author

I fixed all the "only in CI" check issues so this PR is now actually ready for review.

Successfully merging this pull request may close these issues.

TypeForm[T]: Spelling for regular types (int, str) & special forms (Union[int, str], Literal['foo'], etc)