Commit 777159f

encukou, blaisep, StanFromIreland, and AA-Turner authored
gh-135676: Lexical analysis: Reword String literals and related sections (GH-135942)
Co-authored-by: Blaise Pabon <blaise@gmail.com>
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
1 parent 6a285f9 commit 777159f

4 files changed

+460
-223
lines changed

Doc/reference/expressions.rst

Lines changed: 60 additions & 3 deletions
@@ -133,13 +133,18 @@ Literals

 Python supports string and bytes literals and various numeric literals:

-.. productionlist:: python-grammar
-   literal: `stringliteral` | `bytesliteral` | `NUMBER`
+.. grammar-snippet::
+   :group: python-grammar
+
+   literal: `strings` | `NUMBER`

 Evaluation of a literal yields an object of the given type (string, bytes,
 integer, floating-point number, complex number) with the given value. The value
 may be approximated in the case of floating-point and imaginary (complex)
-literals. See section :ref:`literals` for details.
+literals.
+See section :ref:`literals` for details.
+See section :ref:`string-concatenation` for details on ``strings``.
+

 .. index::
    triple: immutable; data; type
@@ -152,6 +157,58 @@ occurrence) may obtain the same object or a different object with the same
 value.


+.. _string-concatenation:
+
+String literal concatenation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Multiple adjacent string or bytes literals (delimited by whitespace), possibly
+using different quoting conventions, are allowed, and their meaning is the same
+as their concatenation::
+
+   >>> "hello" 'world'
+   "helloworld"
+
+Formally:
+
+.. grammar-snippet::
+   :group: python-grammar
+
+   strings: ( `STRING` | fstring)+ | tstring+
+
+This feature is defined at the syntactical level, so it only works with literals.
+To concatenate string expressions at run time, the '+' operator may be used::
+
+   >>> greeting = "Hello"
+   >>> space = " "
+   >>> name = "Blaise"
+   >>> print(greeting + space + name)   # not: print(greeting space name)
+   Hello Blaise
+
+Literal concatenation can freely mix raw strings, triple-quoted strings,
+and formatted string literals.
+For example::
+
+   >>> "Hello" r', ' f"{name}!"
+   "Hello, Blaise!"
+
+This feature can be used to reduce the number of backslashes
+needed, to split long strings conveniently across long lines, or even to add
+comments to parts of strings. For example::
+
+   re.compile("[A-Za-z_]"       # letter or underscore
+              "[A-Za-z0-9_]*"   # letter, digit or underscore
+             )
+
+However, bytes literals may only be combined with other byte literals;
+not with string literals of any kind.
+Also, template string literals may only be combined with other template
+string literals::
+
+   >>> t"Hello" t"{name}!"
+   Template(strings=('Hello', '!'), interpolations=(...))
+
+
 .. _parenthesized:

 Parenthesized forms
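
Reviewer note (not part of the commit): the new "String literal concatenation" section says the feature works at the syntactic level, that raw and formatted literals can be mixed, and that bytes and string literals cannot. The sketch below is one way to check those claims with the standard ``ast`` module and the built-in ``compile()``. The helper name ``compiles`` is made up for illustration, and template strings are left out because they need a newer interpreter.

```python
import ast

# Adjacent literals are joined by the parser: the AST holds a single constant.
tree = ast.parse('"hello" \'world\'', mode="eval")
joined = tree.body
print(type(joined).__name__, repr(joined.value))

# Plain, raw, and formatted literals can sit next to each other.
name = "Blaise"
mixed = "Hello" r", " f"{name}!"
print(mixed)


def compiles(source: str) -> bool:
    """Hypothetical helper: report whether `source` compiles as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Bytes literals combine only with other bytes literals.
print(compiles('b"a" b"b"'))
print(compiles('b"a" "b"'))

# Adjacent names are not literals, so run-time values need the + operator.
print(compiles("greeting space name"))
print(compiles("greeting + space + name"))
```

Note that the interactive interpreter displays the joined string with single quotes (``'helloworld'``), since that is how ``repr()`` renders a string containing no quote characters.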

Doc/reference/grammar.rst

Lines changed: 1 addition & 4 deletions
@@ -10,11 +10,8 @@ error recovery.

 The notation used here is the same as in the preceding docs,
 and is described in the :ref:`notation <notation>` section,
-except for a few extra complications:
+except for an extra complication:

-* ``&e``: a positive lookahead (that is, ``e`` is required to match but
-  not consumed)
-* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match)
 * ``~`` ("cut"): commit to the current alternative and fail the rule
   even if this fails to parse


Doc/reference/introduction.rst

Lines changed: 12 additions & 4 deletions
@@ -145,15 +145,23 @@ The definition to the right of the colon uses the following syntax elements:
 * ``e?``: A question mark has exactly the same meaning as square brackets:
   the preceding item is optional.
 * ``(e)``: Parentheses are used for grouping.
+
+The following notation is only used in
+:ref:`lexical definitions <notation-lexical-vs-syntactic>`.
+
 * ``"a"..."z"``: Two literal characters separated by three dots mean a choice
   of any single character in the given (inclusive) range of ASCII characters.
-  This notation is only used in
-  :ref:`lexical definitions <notation-lexical-vs-syntactic>`.
 * ``<...>``: A phrase between angular brackets gives an informal description
   of the matched symbol (for example, ``<any ASCII character except "\">``),
   or an abbreviation that is defined in nearby text (for example, ``<Lu>``).
-  This notation is only used in
-  :ref:`lexical definitions <notation-lexical-vs-syntactic>`.
+
+.. _lexical-lookaheads:
+
+Some definitions also use *lookaheads*, which indicate that an element
+must (or must not) match at a given position, but without consuming any input:
+
+* ``&e``: a positive lookahead (that is, ``e`` is required to match)
+* ``!e``: a negative lookahead (that is, ``e`` is required *not* to match)

 The unary operators (``*``, ``+``, ``?``) bind as tightly as possible;
 the vertical bar (``|``) binds most loosely.
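
Reviewer note (not part of the commit): the relocated lookahead text says ``&e`` and ``!e`` test for a match without consuming input. A rough analogy, assuming regular-expression lookaheads are close enough for illustration, is ``(?=...)`` and ``(?!...)`` in the standard ``re`` module. This is not how the grammar notation is implemented; it only shows the "match but do not consume" behavior.

```python
import re

# Positive lookahead, like &e: "def" must be followed by a space,
# but the space is not part of the match.
positive = re.compile(r"def(?= )")

# Negative lookahead, like !e: "def" must not be followed by a word character.
negative = re.compile(r"def(?!\w)")

m = positive.match("def f(): pass")
print(m.group(), m.end())  # the match ends before the space

print(negative.match("define"))        # no match: "i" follows "def"
print(negative.match("def(").group())  # matches: "(" is not a word character
```

In both patterns the match ends right after ``def``, so whatever the lookahead inspected is still available to the rest of the pattern.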
