Submitted By: Ken Moffat
Date: 2021-04-01
Initial Package Version: 2.7.18
Upstream Status: Applied to Python-3
Origin: Found at Gentoo and Arch.
Description: Fixes various vulnerabilities. For Python3 some of these are
labelled as critical, but for Python2 in current BLFS (with its limited use)
they are more in the "plug holes to stop people chaining vulnerabilities"
camp. However, anyone still using python2 on a much older BLFS system might
be vulnerable. Some of these vulnerabilities were initially reported only
against Python3, but they also apply to Python-2.7.18.

The vulnerabilities are CVE-2019-20907 (infinite loop in the tarfile
module), CVE-2020-8492 (denial of service via regular-expression
backtracking in basic authentication handling), CVE-2020-26116 (HTTP header
injection via control characters in the request method), CVE-2020-27619
(the CJK codec tests call eval() on content retrieved via HTTP),
CVE-2021-3177 (a buffer overflow in ctypes which may lead to remote code
execution in Python applications that accept floating-point numbers as
untrusted input), and CVE-2021-23336 (web cache poisoning via the urllib
functions when an attacker can separate query parameters using a
semicolon).

The names of most of the individual patches below are the filenames from
the gentoo tarball.

0001-bpo-39017-Avoid-infinite-loop-in-the-tarfile-module-.patch From 893e6e3aee483d262df70656a68f63f601720fcd Mon Sep 17 00:00:00 2001 From: Rishi Date: Wed, 15 Jul 2020 13:51:00 +0200 Subject: [PATCH 01/24] bpo-39017: Avoid infinite loop in the tarfile module (GH-21454) Avoid infinite loop when reading specially crafted TAR files using the tarfile module (CVE-2019-20907). [stripped test to avoid binary patch] --- Lib/tarfile.py | 2 ++ .../next/Library/2020-07-12-22-16-58.bpo-39017.x3Cg-9.rst | 1 + 2 files changed, 3 insertions(+) create mode 100644 Misc/NEWS.d/next/Library/2020-07-12-22-16-58.bpo-39017.x3Cg-9.rst diff --git a/Lib/tarfile.py b/Lib/tarfile.py index adf91d5382..574a6bb279 100644 --- a/Lib/tarfile.py +++ b/Lib/tarfile.py @@ -1400,6 +1400,8 @@ class TarInfo(object): length, keyword = match.groups() length = int(length) + if length == 0: + raise InvalidHeaderError("invalid header") value = buf[match.end(2) + 1:match.start(1) + length - 1] keyword = keyword.decode("utf8") diff --git a/Misc/NEWS.d/next/Library/2020-07-12-22-16-58.bpo-39017.x3Cg-9.rst b/Misc/NEWS.d/next/Library/2020-07-12-22-16-58.bpo-39017.x3Cg-9.rst new file mode 100644 index 0000000000..ad26676f8b --- /dev/null +++ b/Misc/NEWS.d/next/Library/2020-07-12-22-16-58.bpo-39017.x3Cg-9.rst @@ -0,0 +1 @@ +Avoid infinite loop when reading specially crafted TAR files using the tarfile module (CVE-2019-20907). -- 2.30.1 0002-bpo-39503-CVE-2020-8492-Fix-AbstractBasicAuthHandler.patch From 2273e65e11dd0234f2f51ebaef61fc6e848d4059 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= Date: Thu, 10 Sep 2020 13:35:39 +0200 Subject: [PATCH 02/24] bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler (GH-18284) (GH-19304) The AbstractBasicAuthHandler class of the urllib.request module uses an inefficient regular expression which can be exploited by an attacker to cause a denial of service. Fix the regex to prevent the catastrophic backtracking. Vulnerability reported by Ben Caller and Matt Schwager. AbstractBasicAuthHandler of urllib.request now parses all WWW-Authenticate HTTP headers and accepts multiple challenges per header: use the realm of the first Basic challenge.
Co-Authored-By: Serhiy Storchaka (cherry picked from commit 0b297d4ff1c0e4480ad33acae793fbaf4bf015b4) [rebased for py2.7] --- Lib/test/test_urllib2.py | 81 ++++++++++++++++++++++++++-------------- Lib/urllib2.py | 60 +++++++++++++++++++++++------ 2 files changed, 101 insertions(+), 40 deletions(-) diff --git a/Lib/test/test_urllib2.py b/Lib/test/test_urllib2.py index 20a0f58143..0adbb13c43 100644 --- a/Lib/test/test_urllib2.py +++ b/Lib/test/test_urllib2.py @@ -1128,42 +1128,67 @@ class HandlerTests(unittest.TestCase): self.assertEqual(req.get_host(), "proxy.example.com:3128") self.assertEqual(req.get_header("Proxy-authorization"),"FooBar") - def test_basic_auth(self, quote_char='"'): + def check_basic_auth(self, headers, realm): opener = OpenerDirector() password_manager = MockPasswordManager() auth_handler = urllib2.HTTPBasicAuthHandler(password_manager) - realm = "ACME Widget Store" - http_handler = MockHTTPHandler( - 401, 'WWW-Authenticate: Basic realm=%s%s%s\r\n\r\n' % - (quote_char, realm, quote_char) ) + body = '\r\n'.join(headers) + '\r\n\r\n' + http_handler = MockHTTPHandler(401, body) opener.add_handler(auth_handler) opener.add_handler(http_handler) self._test_basic_auth(opener, auth_handler, "Authorization", realm, http_handler, password_manager, "http://acme.example.com/protected", - "http://acme.example.com/protected" - ) - - def test_basic_auth_with_single_quoted_realm(self): - self.test_basic_auth(quote_char="'") - - def test_basic_auth_with_unquoted_realm(self): - opener = OpenerDirector() - password_manager = MockPasswordManager() - auth_handler = urllib2.HTTPBasicAuthHandler(password_manager) - realm = "ACME Widget Store" - http_handler = MockHTTPHandler( - 401, 'WWW-Authenticate: Basic realm=%s\r\n\r\n' % realm) - opener.add_handler(auth_handler) - opener.add_handler(http_handler) - msg = "Basic Auth Realm was unquoted" - with test_support.check_warnings((msg, UserWarning)): - self._test_basic_auth(opener, auth_handler, "Authorization", - realm, http_handler, password_manager, - "http://acme.example.com/protected", - "http://acme.example.com/protected" - ) - + "http://acme.example.com/protected") + + def test_basic_auth(self): + realm = "realm2@example.com" + realm2 = "realm2@example.com" + basic = 'Basic realm="{realm}"'.format(realm=realm) + basic2 = 'Basic realm="{realm2}"'.format(realm2=realm2) + other_no_realm = 'Otherscheme xxx' + digest = ('Digest realm="{realm2}", ' + 'qop="auth, auth-int", ' + 'nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", ' + 'opaque="5ccc069c403ebaf9f0171e9517f40e41"' + .format(realm2=realm2)) + for realm_str in ( + # test "quote" and 'quote' + 'Basic realm="{realm}"'.format(realm=realm), + "Basic realm='{realm}'".format(realm=realm), + + # charset is ignored + 'Basic realm="{realm}", charset="UTF-8"'.format(realm=realm), + + # Multiple challenges per header + ', '.join((basic, basic2)), + ', '.join((basic, other_no_realm)), + ', '.join((other_no_realm, basic)), + ', '.join((basic, digest)), + ', '.join((digest, basic)), + ): + headers = ['WWW-Authenticate: {realm_str}' + .format(realm_str=realm_str)] + self.check_basic_auth(headers, realm) + + # no quote: expect a warning + with test_support.check_warnings(("Basic Auth Realm was unquoted", + UserWarning)): + headers = ['WWW-Authenticate: Basic realm={realm}' + .format(realm=realm)] + self.check_basic_auth(headers, realm) + + # Multiple headers: one challenge per 
header. + # Use the first Basic realm. + for challenges in ( + [basic, basic2], + [basic, digest], + [digest, basic], + ): + headers = ['WWW-Authenticate: {challenge}' + .format(challenge=challenge) + for challenge in challenges] + self.check_basic_auth(headers, realm) def test_proxy_basic_auth(self): opener = OpenerDirector() diff --git a/Lib/urllib2.py b/Lib/urllib2.py index 8b634ada37..b2d1fad6f2 100644 --- a/Lib/urllib2.py +++ b/Lib/urllib2.py @@ -856,8 +856,15 @@ class AbstractBasicAuthHandler: # allow for double- and single-quoted realm values # (single quotes are a violation of the RFC, but appear in the wild) - rx = re.compile('(?:.*,)*[ \t]*([^ \t]+)[ \t]+' - 'realm=(["\']?)([^"\']*)\\2', re.I) + rx = re.compile('(?:^|,)' # start of the string or ',' + '[ \t]*' # optional whitespaces + '([^ \t]+)' # scheme like "Basic" + '[ \t]+' # mandatory whitespaces + # realm=xxx + # realm='xxx' + # realm="xxx" + 'realm=(["\']?)([^"\']*)\\2', + re.I) # XXX could pre-emptively send auth info already accepted (RFC 2617, # end of section 2, and section 1.2 immediately after "credentials" @@ -869,23 +876,52 @@ class AbstractBasicAuthHandler: self.passwd = password_mgr self.add_password = self.passwd.add_password + def _parse_realm(self, header): + # parse WWW-Authenticate header: accept multiple challenges per header + found_challenge = False + for mo in AbstractBasicAuthHandler.rx.finditer(header): + scheme, quote, realm = mo.groups() + if quote not in ['"', "'"]: + warnings.warn("Basic Auth Realm was unquoted", + UserWarning, 3) + + yield (scheme, realm) + + found_challenge = True + + if not found_challenge: + if header: + scheme = header.split()[0] + else: + scheme = '' + yield (scheme, None) def http_error_auth_reqed(self, authreq, host, req, headers): # host may be an authority (without userinfo) or a URL with an # authority - # XXX could be multiple headers - authreq = headers.get(authreq, None) + headers = headers.getheaders(authreq) + if not headers: + # no header found + return - if authreq: - mo = AbstractBasicAuthHandler.rx.search(authreq) - if mo: - scheme, quote, realm = mo.groups() - if quote not in ['"', "'"]: - warnings.warn("Basic Auth Realm was unquoted", - UserWarning, 2) - if scheme.lower() == 'basic': + unsupported = None + for header in headers: + for scheme, realm in self._parse_realm(header): + if scheme.lower() != 'basic': + unsupported = scheme + continue + + if realm is not None: + # Use the first matching Basic challenge. + # Ignore following challenges even if they use the Basic + # scheme. 
return self.retry_http_basic_auth(host, req, realm) + if unsupported is not None: + raise ValueError("AbstractBasicAuthHandler does not " + "support the following scheme: %r" + % (scheme,)) + def retry_http_basic_auth(self, host, req, realm): user, pw = self.passwd.find_user_password(realm, host) if pw is not None: -- 2.30.1 0003-bpo-39603-Prevent-header-injection-in-http-methods-G.patch From 138e2caeb4827ccfd1eaff2cf63afb79dfeeb3c4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= Date: Thu, 10 Sep 2020 13:39:48 +0200 Subject: [PATCH 03/24] bpo-39603: Prevent header injection in http methods (GH-18485) (GH-21539) reject control chars in http method in http.client.putrequest to prevent http header injection (cherry picked from commit 8ca8a2e8fb068863c1138f07e3098478ef8be12e) Co-authored-by: AMIR <31338382+amiremohamadi@users.noreply.github.com> [rebased for py2.7] --- Lib/httplib.py | 17 +++++++++++++++++ Lib/test/test_httplib.py | 20 ++++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/Lib/httplib.py b/Lib/httplib.py index fcc4152aaf..81a08d5d71 100644 --- a/Lib/httplib.py +++ b/Lib/httplib.py @@ -257,6 +257,10 @@ _contains_disallowed_url_pchar_re = re.compile('[\x00-\x20\x7f-\xff]') # _is_allowed_url_pchars_re = re.compile(r"^[/!$&'()*+,;=:@%a-zA-Z0-9._~-]+$") # We are more lenient for assumed real world compatibility purposes. +# These characters are not allowed within HTTP method names +# to prevent http header injection. +_contains_disallowed_method_pchar_re = re.compile('[\x00-\x1f]') + # We always set the Content-Length header for these methods because some # servers will otherwise respond with a 411 _METHODS_EXPECTING_BODY = {'PATCH', 'POST', 'PUT'} @@ -935,6 +939,8 @@ class HTTPConnection: else: raise CannotSendRequest() + self._validate_method(method) + # Save the method for use later in the response phase self._method = method @@ -1020,6 +1026,17 @@ class HTTPConnection: # On Python 2, request is already encoded (default) return request + def _validate_method(self, method): + """Validate a method name for putrequest.""" + # prevent http header injection + match = _contains_disallowed_method_pchar_re.search(method) + if match: + msg = ( + "method can't contain control characters. {method!r} " + "(found at least {matched!r})" + ).format(matched=match.group(), method=method) + raise ValueError(msg) + def _validate_path(self, url): """Validate a url for putrequest.""" # Prevent CVE-2019-9740. 
diff --git a/Lib/test/test_httplib.py b/Lib/test/test_httplib.py index d8a57f7353..e20a0986dc 100644 --- a/Lib/test/test_httplib.py +++ b/Lib/test/test_httplib.py @@ -384,6 +384,26 @@ class HeaderTests(TestCase): with self.assertRaisesRegexp(ValueError, 'Invalid header'): conn.putheader(name, value) + def test_invalid_method_names(self): + methods = ( + 'GET\r', + 'POST\n', + 'PUT\n\r', + 'POST\nValue', + 'POST\nHOST:abc', + 'GET\nrHost:abc\n', + 'POST\rRemainder:\r', + 'GET\rHOST:\n', + '\nPUT' + ) + + for method in methods: + with self.assertRaisesRegexp( + ValueError, "method can't contain control characters"): + conn = httplib.HTTPConnection('example.com') + conn.sock = FakeSocket(None) + conn.request(method=method, url="/") + class BasicTest(TestCase): def test_status_lines(self): -- 2.30.1 0004-bpo-42051-Reject-XML-entity-declarations-in-plist-fi.patch From dd9ccc8454250bb4c2e2fe517edbbbbe7d759e12 Mon Sep 17 00:00:00 2001 From: "Miss Skeleton (bot)" <31488909+miss-islington@users.noreply.github.com> Date: Mon, 19 Oct 2020 21:38:30 -0700 Subject: [PATCH 04/24] bpo-42051: Reject XML entity declarations in plist files (GH-22760) (GH-22801) (GH-22804) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Ronald Oussoren (cherry picked from commit e512bc799e3864fe3b1351757261762d63471efc) Co-authored-by: Ned Deily Rebased for Python 2.7 by Michał Górny --- Lib/plistlib.py | 7 +++++++ Lib/test/test_plistlib.py | 18 ++++++++++++++++++ .../2020-10-19-10-56-27.bpo-42051.EU_B7u.rst | 3 +++ 3 files changed, 28 insertions(+) create mode 100644 Misc/NEWS.d/next/Security/2020-10-19-10-56-27.bpo-42051.EU_B7u.rst diff --git a/Lib/plistlib.py b/Lib/plistlib.py index 42897b8da8..2c2b7fb635 100644 --- a/Lib/plistlib.py +++ b/Lib/plistlib.py @@ -403,9 +403,16 @@ class PlistParser: parser.StartElementHandler = self.handleBeginElement parser.EndElementHandler = self.handleEndElement parser.CharacterDataHandler = self.handleData + parser.EntityDeclHandler = self.handleEntityDecl parser.ParseFile(fileobj) return self.root + def handleEntityDecl(self, entity_name, is_parameter_entity, value, base, system_id, public_id, notation_name): + # Reject plist files with entity declarations to avoid XML vulnerabilies in expat. + # Regular plist files don't contain those declerations, and Apple's plutil tool does not + # accept them either. 
+ raise ValueError("XML entity declarations are not supported in plist files") + def handleBeginElement(self, element, attrs): self.data = [] handler = getattr(self, "begin_" + element, None) diff --git a/Lib/test/test_plistlib.py b/Lib/test/test_plistlib.py index 7859ad0572..612a1d2d6e 100644 --- a/Lib/test/test_plistlib.py +++ b/Lib/test/test_plistlib.py @@ -86,6 +86,19 @@ TESTDATA = """ """.replace(" " * 8, "\t") # Apple as well as plistlib.py output hard tabs +XML_PLIST_WITH_ENTITY=b'''\ + + + ]> + + + A + &entity; + + +''' + class TestPlistlib(unittest.TestCase): @@ -195,6 +208,11 @@ class TestPlistlib(unittest.TestCase): self.assertEqual(test1, result1) self.assertEqual(test2, result2) + def test_xml_plist_with_entity_decl(self): + with self.assertRaisesRegexp(ValueError, + "XML entity declarations are not supported"): + plistlib.readPlistFromString(XML_PLIST_WITH_ENTITY) + def test_main(): test_support.run_unittest(TestPlistlib) diff --git a/Misc/NEWS.d/next/Security/2020-10-19-10-56-27.bpo-42051.EU_B7u.rst b/Misc/NEWS.d/next/Security/2020-10-19-10-56-27.bpo-42051.EU_B7u.rst new file mode 100644 index 0000000000..e865ed12a0 --- /dev/null +++ b/Misc/NEWS.d/next/Security/2020-10-19-10-56-27.bpo-42051.EU_B7u.rst @@ -0,0 +1,3 @@ +The :mod:`plistlib` module no longer accepts entity declarations in XML +plist files to avoid XML vulnerabilities. This should not affect users as +entity declarations are not used in regular plist files. -- 2.30.1 0005-bpo-41944-No-longer-call-eval-on-content-received-vi.patch From 6a6c4240fa1e628dbcca09fdde39aea4d8eb6138 Mon Sep 17 00:00:00 2001 From: "Miss Skeleton (bot)" <31488909+miss-islington@users.noreply.github.com> Date: Mon, 19 Oct 2020 21:46:10 -0700 Subject: [PATCH 05/24] bpo-41944: No longer call eval() on content received via HTTP in the CJK codec tests (GH-22566) (GH-22579) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit (cherry picked from commit 2ef5caa58febc8968e670e39e3d37cf8eef3cab8) Co-authored-by: Serhiy Storchaka Rebased for Python 2.7 by Michał Górny --- Lib/test/multibytecodec_support.py | 23 +++++++------------ .../2020-10-05-17-43-46.bpo-41944.rf1dYb.rst | 1 + 2 files changed, 9 insertions(+), 15 deletions(-) create mode 100644 Misc/NEWS.d/next/Tests/2020-10-05-17-43-46.bpo-41944.rf1dYb.rst diff --git a/Lib/test/multibytecodec_support.py b/Lib/test/multibytecodec_support.py index 5b2329b6d8..b7d7a3aba7 100644 --- a/Lib/test/multibytecodec_support.py +++ b/Lib/test/multibytecodec_support.py @@ -279,30 +279,23 @@ class TestBase_Mapping(unittest.TestCase): self._test_mapping_file_plain() def _test_mapping_file_plain(self): - _unichr = lambda c: eval("u'\\U%08x'" % int(c, 16)) - unichrs = lambda s: u''.join(_unichr(c) for c in s.split('+')) + def unichrs(s): + return ''.join(chr(int(x, 16)) for x in s.split('+')) + urt_wa = {} with self.open_mapping_file() as f: for line in f: if not line: break - data = line.split('#')[0].strip().split() + data = line.split('#')[0].split() if len(data) != 2: continue - csetval = eval(data[0]) - if csetval <= 0x7F: - csetch = chr(csetval & 0xff) - elif csetval >= 0x1000000: - csetch = chr(csetval >> 24) + chr((csetval >> 16) & 0xff) + \ - chr((csetval >> 8) & 0xff) + chr(csetval & 0xff) - elif csetval >= 0x10000: - csetch = chr(csetval >> 16) + \ - chr((csetval >> 8) & 0xff) + chr(csetval & 0xff) - elif csetval >= 0x100: - csetch = chr(csetval >> 8) + chr(csetval & 0xff) - else: + if data[0][:2] != '0x': + self.fail("Invalid line: {line!r}".format(line=line)) + 
csetch = bytes.fromhex(data[0][2:]) + if len(csetch) == 1 and 0x80 <= csetch[0]: continue unich = unichrs(data[1]) diff --git a/Misc/NEWS.d/next/Tests/2020-10-05-17-43-46.bpo-41944.rf1dYb.rst b/Misc/NEWS.d/next/Tests/2020-10-05-17-43-46.bpo-41944.rf1dYb.rst new file mode 100644 index 0000000000..4f9782f1c8 --- /dev/null +++ b/Misc/NEWS.d/next/Tests/2020-10-05-17-43-46.bpo-41944.rf1dYb.rst @@ -0,0 +1 @@ +Tests for CJK codecs no longer call ``eval()`` on content received via HTTP. -- 2.30.1 0006-bpo-40791-Make-compare_digest-more-constant-time.-GH.patch From bfc498a6c971c7393d37c25bdcf5f892afb16ed2 Mon Sep 17 00:00:00 2001 From: "Miss Islington (bot)" <31488909+miss-islington@users.noreply.github.com> Date: Sun, 22 Nov 2020 09:33:09 -0800 Subject: [PATCH 06/24] bpo-40791: Make compare_digest more constant-time. (GH-23438) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The existing volatile `left`/`right` pointers guarantee that the reads will all occur, but does not guarantee that they will be _used_. So a compiler can still short-circuit the loop, saving e.g. the overhead of doing the xors and especially the overhead of the data dependency between `result` and the reads. That would change performance depending on where the first unequal byte occurs. This change removes that optimization. (This is change GH-1 from https://bugs.python.org/issue40791 .) (cherry picked from commit 31729366e2bc09632e78f3896dbce0ae64914f28) Co-authored-by: Devin Jeanpierre Rebased for Python 2.7 by Michał Górny --- .../next/Security/2020-05-28-06-06-47.bpo-40791.QGZClX.rst | 1 + Modules/operator.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 Misc/NEWS.d/next/Security/2020-05-28-06-06-47.bpo-40791.QGZClX.rst diff --git a/Misc/NEWS.d/next/Security/2020-05-28-06-06-47.bpo-40791.QGZClX.rst b/Misc/NEWS.d/next/Security/2020-05-28-06-06-47.bpo-40791.QGZClX.rst new file mode 100644 index 0000000000..69b9de1bea --- /dev/null +++ b/Misc/NEWS.d/next/Security/2020-05-28-06-06-47.bpo-40791.QGZClX.rst @@ -0,0 +1 @@ +Add ``volatile`` to the accumulator variable in ``hmac.compare_digest``, making constant-time-defeating optimizations less likely. \ No newline at end of file diff --git a/Modules/operator.c b/Modules/operator.c index 7ddd123f40..67011a6a82 100644 --- a/Modules/operator.c +++ b/Modules/operator.c @@ -259,7 +259,7 @@ _tscmp(const unsigned char *a, const unsigned char *b, volatile const unsigned char *left; volatile const unsigned char *right; Py_ssize_t i; - unsigned char result; + volatile unsigned char result; /* loop count depends on length of b */ length = len_b; -- 2.30.1 0007-3.6-closes-bpo-42938-Replace-snprintf-with-Python-un.patch From fab838b2ee7cfb9037c24f0f18dfe01aa379b3f7 Mon Sep 17 00:00:00 2001 From: Benjamin Peterson Date: Mon, 18 Jan 2021 15:11:46 -0600 Subject: [PATCH 07/24] [3.6] closes bpo-42938: Replace snprintf with Python unicode formatting in ctypes param reprs. 
(GH-24250) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit (cherry picked from commit 916610ef90a0d0761f08747f7b0905541f0977c7) Co-authored-by: Benjamin Peterson Rebased for Python 2.7 by Michał Górny --- Lib/ctypes/test/test_parameters.py | 43 ++++++++++++++++ .../2021-01-18-09-27-31.bpo-42938.4Zn4Mp.rst | 2 + Modules/_ctypes/callproc.c | 49 +++++++++---------- 3 files changed, 69 insertions(+), 25 deletions(-) create mode 100644 Misc/NEWS.d/next/Security/2021-01-18-09-27-31.bpo-42938.4Zn4Mp.rst diff --git a/Lib/ctypes/test/test_parameters.py b/Lib/ctypes/test/test_parameters.py index 23c1b6e225..3456882ccb 100644 --- a/Lib/ctypes/test/test_parameters.py +++ b/Lib/ctypes/test/test_parameters.py @@ -206,6 +206,49 @@ class SimpleTypesTestCase(unittest.TestCase): with self.assertRaises(ZeroDivisionError): WorseStruct().__setstate__({}, b'foo') + def test_parameter_repr(self): + from ctypes import ( + c_bool, + c_char, + c_wchar, + c_byte, + c_ubyte, + c_short, + c_ushort, + c_int, + c_uint, + c_long, + c_ulong, + c_longlong, + c_ulonglong, + c_float, + c_double, + c_longdouble, + c_char_p, + c_wchar_p, + c_void_p, + ) + self.assertRegexpMatches(repr(c_bool.from_param(True)), r"^$") + self.assertEqual(repr(c_char.from_param('a')), "") + self.assertRegexpMatches(repr(c_wchar.from_param('a')), r"^$") + self.assertEqual(repr(c_byte.from_param(98)), "") + self.assertEqual(repr(c_ubyte.from_param(98)), "") + self.assertEqual(repr(c_short.from_param(511)), "") + self.assertEqual(repr(c_ushort.from_param(511)), "") + self.assertRegexpMatches(repr(c_int.from_param(20000)), r"^$") + self.assertRegexpMatches(repr(c_uint.from_param(20000)), r"^$") + self.assertRegexpMatches(repr(c_long.from_param(20000)), r"^$") + self.assertRegexpMatches(repr(c_ulong.from_param(20000)), r"^$") + self.assertRegexpMatches(repr(c_longlong.from_param(20000)), r"^$") + self.assertRegexpMatches(repr(c_ulonglong.from_param(20000)), r"^$") + self.assertEqual(repr(c_float.from_param(1.5)), "") + self.assertEqual(repr(c_double.from_param(1.5)), "") + self.assertEqual(repr(c_double.from_param(1e300)), "") + self.assertRegexpMatches(repr(c_longdouble.from_param(1.5)), r"^$") + self.assertRegexpMatches(repr(c_char_p.from_param(b'hihi')), "^$") + self.assertRegexpMatches(repr(c_wchar_p.from_param('hihi')), "^$") + self.assertRegexpMatches(repr(c_void_p.from_param(0x12)), r"^$") + ################################################################ if __name__ == '__main__': diff --git a/Misc/NEWS.d/next/Security/2021-01-18-09-27-31.bpo-42938.4Zn4Mp.rst b/Misc/NEWS.d/next/Security/2021-01-18-09-27-31.bpo-42938.4Zn4Mp.rst new file mode 100644 index 0000000000..7df65a156f --- /dev/null +++ b/Misc/NEWS.d/next/Security/2021-01-18-09-27-31.bpo-42938.4Zn4Mp.rst @@ -0,0 +1,2 @@ +Avoid static buffers when computing the repr of :class:`ctypes.c_double` and +:class:`ctypes.c_longdouble` values. 
diff --git a/Modules/_ctypes/callproc.c b/Modules/_ctypes/callproc.c index 066fefc0cc..421addf353 100644 --- a/Modules/_ctypes/callproc.c +++ b/Modules/_ctypes/callproc.c @@ -460,50 +460,51 @@ PyCArg_dealloc(PyCArgObject *self) static PyObject * PyCArg_repr(PyCArgObject *self) { - char buffer[256]; switch(self->tag) { case 'b': case 'B': - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self->value.b); - break; case 'h': case 'H': - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self->value.h); - break; case 'i': case 'I': - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self->value.i); - break; case 'l': case 'L': - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self->value.l); - break; #ifdef HAVE_LONG_LONG case 'q': case 'Q': - sprintf(buffer, + return PyString_FromFormat( "", self->tag, self->value.q); - break; #endif case 'd': - sprintf(buffer, "", - self->tag, self->value.d); - break; - case 'f': - sprintf(buffer, "", - self->tag, self->value.f); - break; - + case 'f': { + PyObject *f = PyFloat_FromDouble((self->tag == 'f') ? self->value.f : self->value.d); + if (f == NULL) { + return NULL; + } + PyObject *r = PyObject_Repr(f); + if (r == NULL) { + Py_DECREF(f); + return NULL; + } + PyObject *result = PyString_FromFormat( + "", self->tag, PyString_AsString(r)); + Py_DECREF(r); + Py_DECREF(f); + return result; + } case 'c': - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self->value.c); - break; /* Hm, are these 'z' and 'Z' codes useful at all? Shouldn't they be replaced by the functionality of c_string @@ -512,16 +513,14 @@ PyCArg_repr(PyCArgObject *self) case 'z': case 'Z': case 'P': - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self->value.p); break; default: - sprintf(buffer, "", + return PyString_FromFormat("", self->tag, self); - break; } - return PyString_FromString(buffer); } static PyMemberDef PyCArgType_members[] = { -- 2.30.1 0024-3.6-bpo-42967-only-use-as-a-query-string-separator-G.patch From e7b005c05dbdbce967a409abd71641281a8604bf Mon Sep 17 00:00:00 2001 From: Senthil Kumaran Date: Mon, 15 Feb 2021 11:16:43 -0800 Subject: [PATCH 24/24] [3.6] bpo-42967: only use '&' as a query string separator (GH-24297) (GH-24532) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl(). urllib.parse will only us "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument seperator with default value "&" is added to specify the separator. 
Co-authored-by: Éric Araujo Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com> Co-authored-by: Adam Goldschmidt Rebased for Python 2.7 by Michał Górny --- Doc/library/cgi.rst | 7 +++- Doc/library/urlparse.rst | 23 ++++++++++- Lib/cgi.py | 20 +++++++--- Lib/test/test_cgi.py | 29 +++++++++++--- Lib/test/test_urlparse.py | 38 +++++++++---------- Lib/urlparse.py | 22 ++++++++--- .../2021-02-14-15-59-16.bpo-42967.YApqDS.rst | 1 + 7 files changed, 100 insertions(+), 40 deletions(-) create mode 100644 Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst diff --git a/Doc/library/cgi.rst b/Doc/library/cgi.rst index ecd62c8c01..b85cdd8b61 100644 --- a/Doc/library/cgi.rst +++ b/Doc/library/cgi.rst @@ -285,10 +285,10 @@ These are useful if you want more control, or if you want to employ some of the algorithms implemented in this module in other circumstances. -.. function:: parse(fp[, environ[, keep_blank_values[, strict_parsing]]]) +.. function:: parse(fp[, environ[, keep_blank_values[, strict_parsing]]], separator="&") Parse a query in the environment or from a file (the file defaults to - ``sys.stdin`` and environment defaults to ``os.environ``). The *keep_blank_values* and *strict_parsing* parameters are + ``sys.stdin`` and environment defaults to ``os.environ``). The *keep_blank_values*, *strict_parsing* and *separator* parameters are passed to :func:`urlparse.parse_qs` unchanged. @@ -316,6 +316,9 @@ algorithms implemented in this module in other circumstances. Note that this does not parse nested multipart parts --- use :class:`FieldStorage` for that. + .. versionchanged:: 3.6.13 + Added the *separator* parameter. + .. function:: parse_header(string) diff --git a/Doc/library/urlparse.rst b/Doc/library/urlparse.rst index 0989c88c30..2f8e4c5a44 100644 --- a/Doc/library/urlparse.rst +++ b/Doc/library/urlparse.rst @@ -136,7 +136,7 @@ The :mod:`urlparse` module defines the following functions: now raise :exc:`ValueError`. -.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]]) +.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]], separator='&') Parse a query string given as a string argument (data of type :mimetype:`application/x-www-form-urlencoded`). Data are returned as a @@ -157,6 +157,9 @@ The :mod:`urlparse` module defines the following functions: read. If set, then throws a :exc:`ValueError` if there are more than *max_num_fields* fields read. + The optional argument *separator* is the symbol to use for separating the + query arguments. It defaults to ``&``. + Use the :func:`urllib.urlencode` function to convert such dictionaries into query strings. @@ -166,7 +169,14 @@ The :mod:`urlparse` module defines the following functions: .. versionchanged:: 2.7.16 Added *max_num_fields* parameter. -.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]]) + .. versionchanged:: 2.7.18-gentoo + Added *separator* parameter with the default value of ``&``. Earlier + Python versions allowed using both ``;`` and ``&`` as query parameter + separator. This has been changed to allow only a single separator key, + with ``&`` as the default separator. + + +.. function:: parse_qsl(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]], separator='&') Parse a query string given as a string argument (data of type :mimetype:`application/x-www-form-urlencoded`). Data are returned as a list of @@ -186,6 +196,9 @@ The :mod:`urlparse` module defines the following functions: read. 
If set, then throws a :exc:`ValueError` if there are more than *max_num_fields* fields read. + The optional argument *separator* is the symbol to use for separating the + query arguments. It defaults to ``&``. + Use the :func:`urllib.urlencode` function to convert such lists of pairs into query strings. @@ -195,6 +208,12 @@ The :mod:`urlparse` module defines the following functions: .. versionchanged:: 2.7.16 Added *max_num_fields* parameter. + .. versionchanged:: 2.7.18-gentoo + Added *separator* parameter with the default value of ``&``. Earlier + Python versions allowed using both ``;`` and ``&`` as query parameter + separator. This has been changed to allow only a single separator key, + with ``&`` as the default separator. + .. function:: urlunparse(parts) Construct a URL from a tuple as returned by ``urlparse()``. The *parts* argument diff --git a/Lib/cgi.py b/Lib/cgi.py index 5b903e0347..9d0848b6b1 100755 --- a/Lib/cgi.py +++ b/Lib/cgi.py @@ -121,7 +121,8 @@ log = initlog # The current logging function # 0 ==> unlimited input maxlen = 0 -def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): +def parse(fp=None, environ=os.environ, keep_blank_values=0, + strict_parsing=0, separator='&'): """Parse a query in the environment or from a file (default stdin) Arguments, all optional: @@ -140,6 +141,9 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): strict_parsing: flag indicating what to do with parsing errors. If false (the default), errors are silently ignored. If true, errors raise a ValueError exception. + + separator: str. The symbol to use for separating the query arguments. + Defaults to &. """ if fp is None: fp = sys.stdin @@ -171,7 +175,8 @@ def parse(fp=None, environ=os.environ, keep_blank_values=0, strict_parsing=0): else: qs = "" environ['QUERY_STRING'] = qs # XXX Shouldn't, really - return urlparse.parse_qs(qs, keep_blank_values, strict_parsing) + return urlparse.parse_qs(qs, keep_blank_values, strict_parsing, + separator=separator) # parse query string function called from urlparse, @@ -395,7 +400,7 @@ class FieldStorage: def __init__(self, fp=None, headers=None, outerboundary="", environ=os.environ, keep_blank_values=0, strict_parsing=0, - max_num_fields=None): + max_num_fields=None, separator='&'): """Constructor. Read multipart/* until last part. 
Arguments, all optional: @@ -430,6 +435,7 @@ class FieldStorage: self.keep_blank_values = keep_blank_values self.strict_parsing = strict_parsing self.max_num_fields = max_num_fields + self.separator = separator if 'REQUEST_METHOD' in environ: method = environ['REQUEST_METHOD'].upper() self.qs_on_post = None @@ -613,7 +619,8 @@ class FieldStorage: if self.qs_on_post: qs += '&' + self.qs_on_post query = urlparse.parse_qsl(qs, self.keep_blank_values, - self.strict_parsing, self.max_num_fields) + self.strict_parsing, self.max_num_fields, + separator=self.separator) self.list = [MiniFieldStorage(key, value) for key, value in query] self.skip_lines() @@ -629,7 +636,8 @@ class FieldStorage: query = urlparse.parse_qsl(self.qs_on_post, self.keep_blank_values, self.strict_parsing, - self.max_num_fields) + self.max_num_fields, + separator=self.separator) self.list.extend(MiniFieldStorage(key, value) for key, value in query) FieldStorageClass = None @@ -649,7 +657,7 @@ class FieldStorage: headers = rfc822.Message(self.fp) part = klass(self.fp, headers, ib, environ, keep_blank_values, strict_parsing, - max_num_fields) + max_num_fields, separator=self.separator) if max_num_fields is not None: max_num_fields -= 1 diff --git a/Lib/test/test_cgi.py b/Lib/test/test_cgi.py index 743c2afbd4..f414faa23b 100644 --- a/Lib/test/test_cgi.py +++ b/Lib/test/test_cgi.py @@ -61,12 +61,9 @@ parse_strict_test_cases = [ ("", ValueError("bad query field: ''")), ("&", ValueError("bad query field: ''")), ("&&", ValueError("bad query field: ''")), - (";", ValueError("bad query field: ''")), - (";&;", ValueError("bad query field: ''")), # Should the next few really be valid? ("=", {}), ("=&=", {}), - ("=;=", {}), # This rest seem to make sense ("=a", {'': ['a']}), ("&=a", ValueError("bad query field: ''")), @@ -81,8 +78,6 @@ parse_strict_test_cases = [ ("a=a+b&b=b+c", {'a': ['a b'], 'b': ['b c']}), ("a=a+b&a=b+a", {'a': ['a b', 'b a']}), ("x=1&y=2.0&z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), - ("x=1;y=2.0&z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), - ("x=1;y=2.0;z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), ("Hbc5161168c542333633315dee1182227:key_store_seqid=400006&cuyer=r&view=bustomer&order_id=0bb2e248638833d48cb7fed300000f1b&expire=964546263&lobale=en-US&kid=130003.300038&ss=env", {'Hbc5161168c542333633315dee1182227:key_store_seqid': ['400006'], 'cuyer': ['r'], @@ -188,6 +183,30 @@ class CgiTests(unittest.TestCase): self.assertEqual(expect[k], v) self.assertItemsEqual(expect.values(), d.values()) + def test_separator(self): + parse_semicolon = [ + ("x=1;y=2.0", {'x': ['1'], 'y': ['2.0']}), + ("x=1;y=2.0;z=2-3.%2b0", {'x': ['1'], 'y': ['2.0'], 'z': ['2-3.+0']}), + (";", ValueError("bad query field: ''")), + (";;", ValueError("bad query field: ''")), + ("=;a", ValueError("bad query field: 'a'")), + (";b=a", ValueError("bad query field: ''")), + ("b;=a", ValueError("bad query field: 'b'")), + ("a=a+b;b=b+c", {'a': ['a b'], 'b': ['b c']}), + ("a=a+b;a=b+a", {'a': ['a b', 'b a']}), + ] + for orig, expect in parse_semicolon: + env = {'QUERY_STRING': orig} + fs = cgi.FieldStorage(separator=';', environ=env) + if isinstance(expect, dict): + for key in expect.keys(): + expect_val = expect[key] + self.assertIn(key, fs) + if len(expect_val) > 1: + self.assertEqual(fs.getvalue(key), expect_val) + else: + self.assertEqual(fs.getvalue(key), expect_val[0]) + def test_log(self): cgi.log("Testing") diff --git a/Lib/test/test_urlparse.py b/Lib/test/test_urlparse.py index 86c4a0595c..0b2107339a 
100644 --- a/Lib/test/test_urlparse.py +++ b/Lib/test/test_urlparse.py @@ -24,16 +24,20 @@ parse_qsl_test_cases = [ ("&a=b", [('a', 'b')]), ("a=a+b&b=b+c", [('a', 'a b'), ('b', 'b c')]), ("a=1&a=2", [('a', '1'), ('a', '2')]), - (";", []), - (";;", []), - (";a=b", [('a', 'b')]), - ("a=a+b;b=b+c", [('a', 'a b'), ('b', 'b c')]), - ("a=1;a=2", [('a', '1'), ('a', '2')]), - (b";", []), - (b";;", []), - (b";a=b", [(b'a', b'b')]), - (b"a=a+b;b=b+c", [(b'a', b'a b'), (b'b', b'b c')]), - (b"a=1;a=2", [(b'a', b'1'), (b'a', b'2')]), + (b"", []), + (b"&", []), + (b"&&", []), + (b"=", [(b'', b'')]), + (b"=a", [(b'', b'a')]), + (b"a", [(b'a', b'')]), + (b"a=", [(b'a', b'')]), + (b"&a=b", [(b'a', b'b')]), + (b"a=a+b&b=b+c", [(b'a', b'a b'), (b'b', b'b c')]), + (b"a=1&a=2", [(b'a', b'1'), (b'a', b'2')]), + (";a=b", [(';a', 'b')]), + ("a=a+b;b=b+c", [('a', 'a b;b=b c')]), + (b";a=b", [(b';a', b'b')]), + (b"a=a+b;b=b+c", [(b'a', b'a b;b=b c')]), ] parse_qs_test_cases = [ @@ -57,16 +61,10 @@ parse_qs_test_cases = [ (b"&a=b", {b'a': [b'b']}), (b"a=a+b&b=b+c", {b'a': [b'a b'], b'b': [b'b c']}), (b"a=1&a=2", {b'a': [b'1', b'2']}), - (";", {}), - (";;", {}), - (";a=b", {'a': ['b']}), - ("a=a+b;b=b+c", {'a': ['a b'], 'b': ['b c']}), - ("a=1;a=2", {'a': ['1', '2']}), - (b";", {}), - (b";;", {}), - (b";a=b", {b'a': [b'b']}), - (b"a=a+b;b=b+c", {b'a': [b'a b'], b'b': [b'b c']}), - (b"a=1;a=2", {b'a': [b'1', b'2']}), + (";a=b", {';a': ['b']}), + ("a=a+b;b=b+c", {'a': ['a b;b=b c']}), + (b";a=b", {b';a': [b'b']}), + (b"a=a+b;b=b+c", {b'a':[ b'a b;b=b c']}), ] class UrlParseTestCase(unittest.TestCase): diff --git a/Lib/urlparse.py b/Lib/urlparse.py index 798b467b60..6c32727fce 100644 --- a/Lib/urlparse.py +++ b/Lib/urlparse.py @@ -382,7 +382,8 @@ def unquote(s): append(item) return ''.join(res) -def parse_qs(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None): +def parse_qs(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None, + separator='&'): """Parse a query given as a string argument. Arguments: @@ -402,17 +403,22 @@ def parse_qs(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None): max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl(). + + separator: str. The symbol to use for separating the query arguments. + Defaults to &. + """ dict = {} for name, value in parse_qsl(qs, keep_blank_values, strict_parsing, - max_num_fields): + max_num_fields, separator=separator): if name in dict: dict[name].append(value) else: dict[name] = [value] return dict -def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None): +def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None, + separator='&'): """Parse a query given as a string argument. Arguments: @@ -432,17 +438,23 @@ def parse_qsl(qs, keep_blank_values=0, strict_parsing=0, max_num_fields=None): max_num_fields: int. If set, then throws a ValueError if there are more than n fields read by parse_qsl(). + separator: str. The symbol to use for separating the query arguments. + Defaults to &. + Returns a list, as G-d intended. """ + if not separator or (not isinstance(separator, (str, bytes))): + raise ValueError("Separator must be of type string or bytes.") + # If max_num_fields is defined then check that the number of fields # is less than max_num_fields. This prevents a memory exhaustion DOS # attack via post bodies with many fields. 
if max_num_fields is not None: - num_fields = 1 + qs.count('&') + qs.count(';') + num_fields = 1 + qs.count(separator) if max_num_fields < num_fields: raise ValueError('Max number of fields exceeded') - pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')] + pairs = [s1 for s1 in qs.split(separator)] r = [] for name_value in pairs: if not name_value and not strict_parsing: diff --git a/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst b/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst new file mode 100644 index 0000000000..f08489b414 --- /dev/null +++ b/Misc/NEWS.d/next/Security/2021-02-14-15-59-16.bpo-42967.YApqDS.rst @@ -0,0 +1 @@ +Fix web cache poisoning vulnerability by defaulting the query args separator to ``&``, and allowing the user to choose a custom separator. -- 2.30.1 Arch py2-ize-the-CJK-codec-test.patch which clearly originates from gentoo. From ed1aa2f4738efe948242f252bcb0aa0b4314d2a2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Micha=C5=82=20G=C3=B3rny?= Date: Fri, 5 Mar 2021 10:34:50 +0100 Subject: py2-ize the CJK codec test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Michał Górny --- Lib/test/multibytecodec_support.py | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Lib/test/multibytecodec_support.py b/Lib/test/multibytecodec_support.py index b7d7a3aba7..661ef9ee37 100644 --- a/Lib/test/multibytecodec_support.py +++ b/Lib/test/multibytecodec_support.py @@ -2,6 +2,7 @@ # Common Unittest Routines for CJK codecs # +import binascii import codecs import os import re @@ -280,7 +281,7 @@ class TestBase_Mapping(unittest.TestCase): def _test_mapping_file_plain(self): def unichrs(s): - return ''.join(chr(int(x, 16)) for x in s.split('+')) + return ''.join(unichr(int(x, 16)) for x in s.split('+')) urt_wa = {} @@ -294,7 +295,7 @@ class TestBase_Mapping(unittest.TestCase): if data[0][:2] != '0x': self.fail("Invalid line: {line!r}".format(line=line)) - csetch = bytes.fromhex(data[0][2:]) + csetch = binascii.a2b_hex(data[0][2:]) if len(csetch) == 1 and 0x80 <= csetch[0]: continue -- cgit v1.2.3
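The behavioural change made by the bpo-42967 (CVE-2021-23336) patch above is
the one most likely to be visible to existing scripts, so a short
illustration follows. This is editorial commentary only, not part of any
patch; it assumes a Python 2.7 interpreter rebuilt with this patch set
applied, so that urlparse.parse_qsl() accepts the new 'separator' keyword.

  # Illustrative sketch only (assumes the patched urlparse from above).
  from urlparse import parse_qsl

  # The unpatched module split on both '&' and ';', which let an attacker
  # smuggle extra parameters past caches and proxies that only treat '&'
  # as a separator; the patched module splits on '&' alone by default.
  print parse_qsl("a=1;b=2")                 # [('a', '1;b=2')]
  print parse_qsl("a=1&b=2")                 # [('a', '1'), ('b', '2')]

  # Code that genuinely needs ';' as the separator must now ask for it:
  print parse_qsl("a=1;b=2", separator=';')  # [('a', '1'), ('b', '2')]

Applications that relied on the old mixed ';'/'&' behaviour must pass
separator=';' explicitly (or pre-split the query string themselves); a
single call to the patched functions cannot use both separators at once.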